Closed maxim-le closed 2 years ago
esphome documentation esphome source (message by IssueLinks)
Hey there @ottowinter, @jesserockz, mind taking a look at this issue as it has been labeled with an integration (esphome
) you are listed as a code owner for? Thanks!
(message by CodeOwnersMention)
Hi, yeah I've seen this before too, but it's really hard to reproduce unfortunately (for me happens only once a month or so)
If you have a setup that reproduces this relatively frequently it would be great to get some verbose logs to try to see what might be happening.
In your Home Assistant (!!) configuration.yaml, put
logger:
default: info
logs:
homeassistant.components.esphome: debug
aioesphomeapi.reconnect_logic: debug
aioesphomeapi._frame_helper: debug
aioesphomeapi.connection: debug
aioesphomeapi.host_resolver: debug
And post the debug logs of around the time of the last connect attempt
Note: those settings make HA debug all esphome traffic, so might be good to first check if there's something sensitive in there and if so filter it out.
Thanks!
Sure thing! I'll set this up and post the logs here when it'll happen again :)
I see these disconnects way more often than once a month or week, so I will also provide logs.
Like the OP said, when I look at the logs via the web interface, the device connects without any problems, but HA still shows entities as "Unavailable"...
I got those as well every 12-14 hours or so. Setting up the debug of esphome in HA too.
For what I could observe so far, it affect only esp boards where noise encryption is set up (half of my fleet).
Here's a bit of a log file showing disconnects. I also have (all my) nodes on Noise, like @RobinMarsollier.
Some more logs (10 min, including the moment of the disconnect), disconnects happens at 21:03:42 for bedroom_sensors (192.168.0.28), and from this point on, no trace of it on the logs except for this line a few minutes later :
2022-02-08 21:14:39 DEBUG (MainThread) [aioesphomeapi.reconnect_logic] bedroom_sensors @ 192.168.0.28: Triggering reconnect because of received mDNS record record[ptr,in,_esphomelib._tcp.local.]=4500/4499,bedroom_sensors._esphomelib._tcp.local.
Thanks for the logs guys/gals!
Using the logs I was able to identify one real problem, when it's merged perhaps you can check if https://github.com/home-assistant/core/pull/66189 fixes the issue for you.
Specifically (from the logs from @RobinMarsollier) here's an example.
First the device disconnects, and a reconnect is immediately started
2022-02-08 21:03:42 INFO (MainThread) [aioesphomeapi.reconnect_logic] Disconnected from ESPHome API for bedroom_sensors @ 192.168.0.28
2022-02-08 21:03:42 DEBUG (MainThread) [aioesphomeapi.connection] bedroom_sensors @ 192.168.0.28: Socket closed, stopping read loop
2022-02-08 21:03:42 DEBUG (MainThread) [aioesphomeapi.connection] bedroom_sensors @ 192.168.0.28: Connecting to 192.168.0.28:6053 (AddrInfo(family=<AddressFamily.AF_INET: 2>, type=<SocketKind.SOCK_STREAM: 1>, proto=6, sockaddr=IPv4Sockaddr(address='192.168.0.28', port=6053)))
That socket is opened successfully and a TCP connection established, as we can see by this later message
2022-02-08 21:03:49 DEBUG (MainThread) [aioesphomeapi.connection] 192.168.0.28: Opened socket for
https://github.com/esphome/aioesphomeapi/blob/v10.8.1/aioesphomeapi/connection.py#L167
Next is initializing the frame helper and performing the handshake, and indeed we can see an (empty) ClientHello handshake frame being transmitted in the next log line
2022-02-08 21:03:49 DEBUG (MainThread) [aioesphomeapi._frame_helper] Sending frame
https://github.com/esphome/aioesphomeapi/blob/v10.8.1/aioesphomeapi/_frame_helper.py#L148
Now the logging here isn't great (unable to see which Received frame
corresponds to which connection), but I think what's happening is the connection is silently closed (for example by the device dropping off the network at just the right time), and so the read_frame to get the server's response just hangs forever.
That is a bug and is fixed by https://github.com/home-assistant/core/pull/66189, but I can't say with certainty if it's the issue you're seeing.
I'll wait for it to be released to test it (winter here, I rely on HA quite a lot, i can't to hack too much right now). If you need to expand the logging to track it down more precisely, i'd be happy to follow-up on it here tho.
If I make those changes in the 3 files listed in the commit, will the updated component automatically be downloaded, or do I need to download this manually?
Ok, found the changes in the https://github.com/esphome/aioesphomeapi/pull/176/files PR, applied those in root@a0d7b954-esphome:/usr/local/lib/python3.9/dist-packages/aioesphomeapi
, cleared the __pycache__
for those 2 files, restarted HA.
In the ESPHome dashboard, all devices I expect to be online are online, and I'll check if any of them fall out. I have a dashboard showinf exactly the 4 sensors I need for that :-)
Looks like it did the trick on my end, no more disconnected board on the past days.
I have seen sporadic disconnects, but they almost always reconnected within some dozens (tens? Not sure if that's a correct sentence...) of seconds.
Updated to 2022.2.6 a few days ago and no problems since! :) Thanks for all the help! I'll keep you posted if I experience any more disconnects
Ok, as this issue appears to be fixed I'll close it. Note: with "the issue" I mean HA stopping reconnecting at all after a while (not sporadic disconnects, these can be for any number of reasons, like wifi coverage etc)
The problem
Over the past few months I have been experiencing an issue where HA would disconnect from an ESPHome device and not try to reconnect. I included below some of the logs I am seeing. This issue is very statistical, I saw it happen once a month and I saw it happen a couple of times a week (even once a day).
Even more strange is the fact that none of these logs are at the time of disconnection (in the example below - the ESPHome device is disconnected from HA since 1PM and these logs are from around 10AM and there are no logs regarding ESPHome from around 1PM)
The ESPHome device itself is connected to the WiFi network and is accessible via the web interface. I am also using an API password and encryption key, a static IP, power save mode NONE and other recommended features enabled that should have fixed this issue. (see YAML example below).
I tried to reset the ESPHome device and update its ESPHome version - but it did not solve the issue (HA is still disconnected). However, when I reload the ESPHome integration - it all works (until it disconnects again). This is what led me to believe that the issue is probably in HA and not in ESPHome.
If there's anything else I can provide to help debug this issue, please let me know. Thank you in advance!
What version of Home Assistant Core has the issue?
core-2022.2.0
What was the last working version of Home Assistant Core?
No response
What type of installation are you running?
Home Assistant OS
Integration causing the issue
ESPHome
Link to integration documentation on our website
https://www.home-assistant.io/integrations/esphome/
Example YAML snippet
Anything in the logs that might be useful for us?
Additional information
ESPHome integration Diagnostics