Closed Nick-Adams-AU closed 1 year ago
@Nick-Adams-AU are you having connectivity issues? The "can't connect to host" errors are weird. This is sort of handled, but not in the initial authentication part: it either works or it doesn't. If it fails to connect at that stage, it won't keep retrying.
Once it auths, there is retry and reconnect logic.
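For illustration only, here is a minimal sketch of what retrying the initial connection could look like. It is not the integration's actual code: `connect_with_retry`, its parameters, and the 120-second default delay are all hypothetical, and the caller's `connect` callable is assumed to raise `ConnectionError` or `TimeoutError` on failure.

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def connect_with_retry(
    connect: Callable[[], T],
    *,
    attempts: int = 5,
    delay_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `connect` until it succeeds or `attempts` tries are used up.

    Hypothetical helper: `connect` stands in for whatever performs the
    initial authentication. `sleep` is injectable so the wait can be
    replaced in tests.
    """
    last_error: Optional[Exception] = None
    for attempt in range(1, attempts + 1):
        try:
            return connect()
        except (ConnectionError, TimeoutError) as err:
            last_error = err
            # Wait before the next try, but not after the final failure.
            if attempt < attempts:
                sleep(delay_s)
    raise ConnectionError(f"gave up after {attempts} attempts") from last_error
```

A fixed delay keeps the sketch short. An exponential backoff with a cap would be a reasonable alternative if the vendor's servers are down for longer periods.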
Does the AWS Broker sensor get set to off? In the interim, you could use this value to restart the integration automatically, using a service call to homeassistant.reload_config_entry.
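As a rough sketch of that workaround, the reload can also be triggered from outside Home Assistant through its REST API, which exposes services at `/api/services/<domain>/<service>`. The helper names, base URL, token, and entry ID below are placeholders and assumptions, not values from this integration. The state strings checked in `should_reload` are also assumed from the wording in this thread.

```python
import json
import urllib.request


def should_reload(broker_state: str) -> bool:
    """Return True when the broker sensor reports a dropped connection.

    The state strings are assumptions; check what the sensor actually reports.
    """
    return broker_state.strip().lower() in {"off", "disconnected"}


def build_reload_request(
    base_url: str, token: str, entry_id: str
) -> urllib.request.Request:
    """Build (but do not send) the POST that reloads one config entry.

    `token` is a Home Assistant long-lived access token and `entry_id`
    is the config entry ID of the integration; both are placeholders here.
    """
    url = f"{base_url.rstrip('/')}/api/services/homeassistant/reload_config_entry"
    body = json.dumps({"entry_id": entry_id}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned request to `urllib.request.urlopen` would send it. Inside Home Assistant itself, an automation triggered by the sensor's state change is likely the simpler route.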
So I can load the integration and it will stay connected for many hours without issue. When it is "healthy", the "AWS Broker" sensor shows connected. After some indeterminate time, it will move to "disconnected" and it will never reconnect until I either reload the integration or restart HA. If I reload the integration, as you suggested, it comes back up immediately and is fine again for a few hours.
I have a (very?) reliable internet connection and don't have issues with any other web polling integrations. It is possible that my firewall is killing long connections or the odd packet goes missing here or there but generally, my HA integrations are rock solid.
Looking at the errors, the integration doesn't seem to be reconnecting?
I had the same error around that time. It seems the servers went into maintenance mode or something like that. I will work over the weekend to improve recovery from this kind of error.
Actually, the recovery worked pretty well and it reconnected once the issue was resolved. It began at 6:56:31 AM IL TZ and tried to reconnect every 2 minutes until 7:18:32 AM, when it finally connected.
Issue caused by vendor outage
Hey,
Thanks so much for this integration! On first connect, it works a treat and is a handy inclusion into HA. Thank you!
I appreciate that this integration is new and still a bit raw. I seem to be having some connection timeout issues. If I reload the integration, the connection works fine at first, but I have noticed that it eventually dies a number of hours later (~5+ hours). The logged errors are below.
Do we need either a keepalive or a reconnect-on-failure check?