Closed dartanidi closed 1 year ago
I forgot to say I'm using your example.yaml and Esphome v2021.12.1
I haven't seen this issue before. My setup is running 10 days++ without issues. I will try to provide some ideas to isolate the cause:
Thanks for replying syssi.
One question: have you tried your setup with the latest ESPhome version (like in my case)? Thanks again for your help.
I'm not able to sniff any packet (except MDNS and broadcast) from/to the bridge using wireshark; I have limited experience in it. Anyway looking at the ESPhome log I found that the streamingserver is reportin some client disconnections and further reconnections with the HA IP. I set the logger to very verbose to check if there is something more.
If your network is switched you can only see broadcasts if you use a third host to sniff traffic. You have to make sure to capture the traffic at the network card of your HA instance (using tcpdump -i eth0 -s0 -w traffic.pcap
). But it looks like this isn't necessary anymore because you already know the root-cause: Connection issues.
I've updated my setup vom 2021.11.4
to esphome 2021.12.2
now. I will report tomorrow it's stable or not.
Update: I flashed the bridge with the custom tasmota firmware from thehelpfulidiot, and seems to be working flawlessly.
If this is the case, looks like the stream-server
component could have issues with the very recent versions of ESPhome.
I will keep you updated if the actual setup will continue working. Thanks anyway for your help, much appreciated!
You are right. I can see multiple reconnects using 2021.12.2
:
[17:21:57][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:21:58][D][streamserver:106]: New client connected from 192.168.1.100
[17:25:37][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:25:37][D][streamserver:106]: New client connected from 192.168.1.100
[17:28:04][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:28:05][D][streamserver:106]: New client connected from 192.168.1.100
[17:35:39][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:35:40][D][streamserver:106]: New client connected from 192.168.1.100
[17:40:00][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:40:00][D][streamserver:106]: New client connected from 192.168.1.100
[17:40:12][D][streamserver:106]: New client connected from 192.168.1.100
[17:42:49][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:42:49][D][streamserver:106]: New client connected from 192.168.1.100
[17:46:59][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:46:59][D][streamserver:106]: New client connected from 192.168.1.100
[17:48:25][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:48:26][D][streamserver:106]: New client connected from 192.168.1.100
[17:50:12][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:50:12][D][streamserver:106]: New client connected from 192.168.1.100
[17:52:08][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:52:09][D][streamserver:106]: New client connected from 192.168.1.100
[17:53:35][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:53:35][D][streamserver:106]: New client connected from 192.168.1.100
[17:57:24][D][streamserver:055]: Client 192.168.1.100 disconnected
[17:57:25][D][streamserver:106]: New client connected from 192.168.1.100
[18:06:11][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:06:11][D][streamserver:106]: New client connected from 192.168.1.100
[18:12:24][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:12:24][D][streamserver:106]: New client connected from 192.168.1.100
[18:22:33][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:22:33][D][streamserver:106]: New client connected from 192.168.1.100
[18:27:34][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:27:35][D][streamserver:106]: New client connected from 192.168.1.100
[18:29:21][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:29:21][D][streamserver:106]: New client connected from 192.168.1.100
I've flashed the device using esphome 2021.11.4
again to verify this version is better for real.
The log message at the esphome node are the same on 2021.11.4
:
[18:45:43][D][streamserver:106]: New client connected from 192.168.1.100
[18:53:17][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:53:17][D][streamserver:106]: New client connected from 192.168.1.100
[18:57:37][D][streamserver:055]: Client 192.168.1.100 disconnected
[18:57:37][D][streamserver:106]: New client connected from 192.168.1.100
[19:23:18][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:23:19][D][streamserver:106]: New client connected from 192.168.1.100
[19:27:18][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:27:19][D][streamserver:106]: New client connected from 192.168.1.100
[19:32:50][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:32:51][D][streamserver:106]: New client connected from 192.168.1.100
[19:36:09][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:36:09][D][streamserver:106]: New client connected from 192.168.1.100
[19:52:46][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:52:47][D][streamserver:106]: New client connected from 192.168.1.100
[19:55:55][D][streamserver:055]: Client 192.168.1.100 disconnected
[19:55:55][D][streamserver:106]: New client connected from 192.168.1.100
[20:04:42][D][streamserver:055]: Client 192.168.1.100 disconnected
[20:04:42][D][streamserver:106]: New client connected from 192.168.1.100
[20:14:50][D][streamserver:055]: Client 192.168.1.100 disconnected
[20:14:50][D][streamserver:106]: New client connected from 192.168.1.100
Could it be possible that the ESPhome directories need to be cleaned before downgrading the files? Otherwise it doesn't have much sense since previously it was working
May be my HA core instance is closing the connection intentionally. I will try to monitor the HA core log as next step.
At the home-assistant.log
I can see the warnings now:
2021-12-31 10:26:01 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 10:26:16 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 10:26:31 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 10:26:46 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 10:27:01 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
I will capture some traffic to get a better understanding who drops the connection.
@dartanidi Do you use ZHA or Z2M now?
I'm using ZHA, btw with the tasmota custom firmware it's working perfectly, no errors are reported in the log.
I still suspect there is an issue within the stream-server
component, I'm reading the issues in oxan's repository and he said the component should have to be rewritten using the new socket
abstraction instead of async_tcp
I assume you are on the right track. Let's see! :-)
2021-12-31 11:57:22 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 11:57:37 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 11:57:52 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 11:58:07 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 11:58:22 WARNING (MainThread) [bellows.zigbee.application] Watchdog heartbeat timeout:
2021-12-31 11:58:24 WARNING (bellows.thread_0) [bellows.uart] Reset future is None
2021-12-31 11:58:24 ERROR (MainThread) [homeassistant] Error doing job: Exception in callback ThreadsafeProxy.__getattr__.<locals>.func_wrapper.<locals>.check_result_wrapper() at /srv/homeassistant/lib/python3.8/site-packages/bellows/thread.py:97
Traceback (most recent call last):
File "/usr/lib/python3.8/asyncio/events.py", line 81, in _run
self._context.run(self._callback, *self._args)
File "/srv/homeassistant/lib/python3.8/site-packages/bellows/thread.py", line 98, in check_result_wrapper
result = call()
File "/srv/homeassistant/lib/python3.8/site-packages/bellows/ezsp/__init__.py", line 258, in frame_received
self._protocol(data)
File "/srv/homeassistant/lib/python3.8/site-packages/bellows/ezsp/protocol.py", line 138, in __call__
frame_name = self.COMMANDS_BY_ID[frame_id][0]
KeyError: 520
Exactly what was happening to me, see first post
In short: I'm able to reproduce the issue now. ;-)
I captured some traffic. It looks like the connection is terminated by the watchdog because of no(?) serial/ezsp traffic.
https://github.com/zigpy/bellows/blob/dev/bellows/zigbee/application.py#L44-L50 https://github.com/zigpy/bellows/blob/dev/bellows/zigbee/application.py#L696-L752
Quick update: It looks like if you downgrade esphome to 2021.9.3
the old stream server implementation is used. This implementation didn't throw any errors in about 6 hours.
Hi,
i'm having this same issue but i am unable to easily rollback to 2021.9.3
. Is there any new solution to this issue?
I was running 2021.12.3
and i was able to recover the 2021.11.1
backup and same issue.
Thanks
It's pretty easy to install an old esphome version if you have access to a linux host. Do you need some instructions?
If you are so kind to provide to me it wold be great!! Thanks!!!!
Btw i can imagine that i have to install the 2021.9.3
version on my linux distro and "update" the coordinator with that version, right?
Yes. Just pick a linux machine, make sure the python package manager pip3
is available. If not please install the package manager. On debian/ubuntu it looks like this: sudo apt install python3-pip
. As soon the python package manager is available you are able to install any esphome version by:
pip3 install esphome==2021.9.3
It's possible the path to the esphome CLI command isn't part of your $PATH
. Just call the command with the full path:
~/.local/bin/esphome version
This should return 2021.9.3
. Now you are able to install/flash your gateway:
~/.local/bin/esphome run your-configuration.yaml
You could upgrade the esphome install later on to the most recent version:
pip3 install esphome -U
But keep in mind to downgrade the version again if you want to update/improve your gateway configuration.
Thank you so much!! It worked!! Now i have to remember to not "Update All" on the esphome dashboard :D Now it's not necessary to reflash the zigbee module, rigth?
Did you flash your ZigBee module ever? It's independent from the ESPHome version. The ZigBee module is equipped with the last flashed firmware.
Did you flash your ZigBee module ever? It's independent from the ESPHome version. The ZigBee module is equipped with the last flashed firmware.
Yes, i followed the entire guide and the coordinator was working but poorly with EspHome 2021.12.3
.
Could you create a new issue first? We should avoid off-topic questions here.
It looks like the issue is gone. This user don't have trouble with a recent esphome version: https://community.home-assistant.io/t/zb-gw03-ewelink-ethernet-zigbee-gateway-now-hacked-with-tasmota-and-esphome-so-can-be-used-via-mqtt-or-as-a-remote-zigbee-lan-adapter-with-home-assistants-zha-integration-or-zigbee2mqtt/341223/59
@syssi oxan has updated the stream server component, now your ESPhome firmware works as expected, much better than the Tasmota one. Now your code can be used as is without any issue. Thanks!
Cool! I will update the project as soon as possible. I would be happy about a pull request too!
Cool! I will update the project as soon as possible. I would be happy about a pull request too!
There is no no need, your yaml files are working out of the box since the component works exactly as the previous version. I'm testing everything using the latest ESPhome version, and no issues since more than one day
@syssi first of all thank you for making this Esphome firmware possible for the zb-gw03.
I moved from a Sonoff Zbbridge where I had sometime some connection stability problems looking for a more reliable solution. Unfortunately this seems totaly unstable, with bellows reporting a lot of watchdog heartbeat timeouts in the log.
It eventually stops working with HA reporting errors like this:
which I read are tipical in the ZB to WiFi bridges. So it looks like a connection problem, but HA PC and the bridge are connected via the same switch, and the bridge is never reported to be unavailable.
Toggling the Zigbee Reset switch solves the issue, but this problems occurs say every hour or two. Any clue? Thanks for your help.