Closed chrisdrackett closed 1 year ago
Have you tried the most recent esp32/esphome firmwares? they have some improvements as far a connection reliability.
nope, I haven't updated it since I got it as everything was working well. I'll give this a go
hm, now it seems worse? It starts, but I get a lot of the following in the logs:
21:36:14 | [D] | [esp-idf:000] | E (69413) esp.emac: no mem for receive buffer
-- | -- | -- | --
21:36:14 | [D] | [esp-idf:000] | E (69415) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69771) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69772) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69784) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69785) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69974) esp.emac: no mem for receive buffer
21:36:15 | [D] | [esp-idf:000] | E (69975) esp.emac: no mem for receive buffer
before I lose the connection entirely and it won't come back till I power cycle.
recommend trying to flash over serial so the esp32 gets a full erase to clear out any old vestiges from the old install
yeah, I did it over serial (OTA didn't seem documented, and I'm nervous enough as it is :P)
ok, I flashed it again just to be safe and it seems to be less noisy at least:
21:38:46 | [D] | [api:102] | Accepted ::FFFF:192.168.1.190
-- | -- | -- | --
21:38:46 | [D] | [api.connection:918] | Home Assistant 2023.1.1 (::FFFF:192.168.1.190): Connected successfully
21:39:02 | [D] | [streamserver:074] | New client connected from 192.168.1.190
21:39:02 | [D] | [binary_sensor:036] | 'TubesZB Serial Connected': Sending state ON
21:40:01 | [D] | [esp-idf:000] | E (85541) esp.emac: emac_esp32_receive(244): received buffer longer than expected
21:40:01 | [D] | [esp-idf:000] | E (85543) esp.emac: emac_esp32_receive(244): received buffer longer than expected
21:40:01 | [D] | [esp-idf:000] | E (85554) esp.emac: emac_esp32_receive(244): received buffer longer than expected
I'll keep an eye on it and see if I have any more disconnects
okay it's late here, so let me know. the esp-idf lines sort of surprise me. but I may just be tired. as I've been testing some idf stuff today too.
Sadly I think its actually worse after the update. In the logs I get something like the following after about 2 minutes of power:
07:41:24 | [D] | [esp-idf:000] | E (61813) esp.emac: emac_esp32_receive(244): received buffer longer than expected
-- | -- | -- | --
07:41:25 | [D] | [esp-idf:000] | E (62643) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:41:25 | [D] | [esp-idf:000] | E (62645) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:41:25 | [D] | [esp-idf:000] | E (62656) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:41:25 | [D] | [esp-idf:000] | E (62837) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:41:25 | [D] | [esp-idf:000] | E (63273) esp.emac: no mem for receive buffer
07:41:25 | [D] | [esp-idf:000] | E (63275) esp.emac: no mem for receive buffer
07:41:25 | [D] | [esp-idf:000] | E (63278) esp.emac: no mem for receive buffer
07:41:25 | [D] | [esp-idf:000] | E (63300) esp.emac: no mem for receive buffer
at which point I lose both the web interface and the connection from Z2M. A restart fixes it, but only for a couple of minutes.
does powering over usb (remove from case) and DO NOT use powered ethernet help it run longer?
I can get a replacement Olimex ESP32-PoE module out to you today. - I think with a new case too as the module is now longer than it was when you purchased.
so I turn off POE on this port and plug in USB?
yes. the usb and poe are not isolated from each other so be sure PoE is off.
I'm seeing the same thing powered by USB. At this point its happening immediately:
07:58:33 | [D] | [api:102] | Accepted ::FFFF:192.168.1.190
-- | -- | -- | --
07:58:33 | [D] | [api.connection:918] | Home Assistant 2023.1.1 (::FFFF:192.168.1.190): Connected successfully
07:58:48 | [D] | [esp-idf:000] | E (25728) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:58:48 | [D] | [esp-idf:000] | E (25730) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:58:48 | [D] | [esp-idf:000] | E (25741) esp.emac: emac_esp32_receive(244): received buffer longer than expected
07:58:50 | [D] | [esp-idf:000] | E (27638) esp.emac: no mem for receive buffer
07:58:50 | [D] | [esp-idf:000] | E (27642) esp.emac: no mem for receive buffer
07:58:50 | [D] | [esp-idf:000] | E (27647) esp.emac: no mem for receive buffer
07:58:50 | [D] | [esp-idf:000] | E (27663) esp.emac: no mem for receive buffer
I think there is something up with the hardware in the esp. I will get a new one out today. can you just confirm address is same as original order?
in the mean time if you have an USB to serial converter you can wire up the zigbee module to serial to keep you online. the pins are labeled on the underside of the PCB let me know if you need any clarification.
Yep, address is the same! I don't have a converter, but I do have my old coordinator somewhere still. I think in theory I need to change its ieee or something? I'll look it up…
z2m should be able to restore current network to it...just not sure on the conditions it uses. just make a copy of the backup in the zigbee2mqtt folder before doing anything ;)
man, now I'm just getting garbage when connected via esphome-flasher :\
Serial port closed!
[08:48:28][3m[e22]nting vitt f! Re-connecting...
[08:48:43]ethernet:122Cocting via ettnnectg...0]ht:1: a[3toetn..[m
Serial port closed!
[08:48:43]ethernet:122Cocting via ettnnectg...0]ht:1: a[3toetn..[m
[08:48:58]3WtConeca e;3[[ent12:Cnei i tentfie!R-ncig.0m[W[ehret12:Connecngvia etret failed! Re-con[0m
[08:49:13]ten:2] ti hennecting...3mrnt12] oncti hre aecting...
[08:49:28][0;3[W][ethee:2:Cneng viaetentfiled! Re-conn[03mW[ Re-con.
Serial port closed!
[08:49:28][0;3[W][ethee:2:Cneng viaetentfiled! Re-conn[03mW[ Re-con.
never seen that. it either works on not in the cases where they fail in my experience.
hm, FYI (and I'm not sure why I did this) but I updated the firmware on my CC2652P2 based Zigbee to Ethernet/USB Serial Coordinator
before using it and I'm seeing the same no mem for receive buffer
in my logs.
I just rolled back to tube-zb-gw-cc2652p2-v2_2021_09_17.bin
and I'm not seeing the odd lines in my log. 🤞I don't see the other issues either!
ok, nevermind, just got E (40079) esp.emac: no mem for receive buffer
again. Could something else be going on? Is this a HASS or Z2M issue after all?
very strange been usign the new FW for months now with never seen anything like this.
I compiled a new Dev FW right now, that is based on the -IDF framework can you give this a try on the PoE module? flash over serial. tubeszb-cc2652-poe-2022-idf-20230203.bin.zip
they are showing in the esphome log as I understand it, so definitely think it's focused there. do you have a different switch you could try - did anything change on the switch side recently?
I recently got an aggregation switch to reduce daisy chaining, but the coordinator has been plugged into the same switch and port the entire time I've owned it.
getting the same thing. I'm racking my brain but can't think of anything that has changed, especially in the last day or so when it went from dying every week to dying within hours/minutes/seconds.
ok, I was wrong, I do have Z2M set to auto-update (something I generally try to avoid). My current guess is something changed there, because everything else hasn't changed in months.
I would be hearing from A LOT of people if the z2m update caused issues. I'm leaning toward Network/Power or hardware. I'll try and do some more research on my side as well.
True, just trying to figure out what changed. I'm concerned because I see the same thing on two pieces of hardware, which makes me think its something on my end :(
there is a fix for this or a similar error that got merged into ESP-IDF in late November. the version used in esphome is currently before this. and I'm not really sure how the versioning in IDF tracks. but I'm attempting to build with the "Latest" to see what happens.
ok, I have the POE/USB coordinator plugged into my HASS box via USB at the moment to see what happens. I'm not sure how to see logs from the coordinator in this case (if it is even possible) but I'll keep an eye on Z2M to see how long it stays up
If zigbee running over USB the ESP32 plays no role.
do you still want me to ship out the new ESP32-PoE board? maybe worth having on hand to try.
If you think it would help debug. Because I've seen the same thing on 2 different coordinators I'm not convinced that new hardware will make a difference, but I also have 0 idea what is going on or how these things work :D
No problem.
so the ESP-IDF just had a bug fix release(4.4.4) earlier today in the version Esphome uses(4.4.2) and it's the first since that issue I saw had a merge so the tool chains aren't even all updated yet to support it to see if makes a difference. but iI'll keep an eye on it.
sounds good, I'm still super confused why I just started seeing this recently, so who knows!
@tube0013 got the new board, do I just move the zigbee module over? do I need to do anything else to make sure I don't break anything on my zigbee network?
Just move it over, you will need to update the ip in your config or change the dns reservation for the new board.
@tube0013 thanks, unfortunately this one is having the same issue. I can only assume this is something on my network / the issue you linked. I'm in no way a networking expert, but seems like something on my network is sending more data / faster data than the ESP can handle?
Trying fixing the network port in the switch to 100Mb full duplex.
Also what kind of switch are you using?
its a unifi switch (US-8-150W) forcing it to 100Mb full duplex does not change anything (that already how it connects via the default "auto" setting.
I assume that the coordinator is picking up on some sort of traffic on my network as I don't even currently have this device setup to connect to Z2M and its still crashing almost immediately.
okay this is just really odd. I do all my testing with unifi switches (and router). do you want to try a static IP build?
I'm happy to do anything to help debug! I'm currently trying to figure out how to setup port mirroring and wireshark, but I'll probably have little idea what I'm seeing :P
grab the yaml from here:
and drop it in esphome folder (if you don't have it let me know)
uncomment the static ip stuff and fill it out ( send me the ip and gateway ip if you don't have esphome).
ok, I just setup a network with multicast turned off and added the coordintator to that network. So far it's been up for minutes, so my current guess is that something apple is doing with multicast and the culprit (I don't have any other devices on my network that use multicast that I'm aware of)
I have multicast and IGMP enabled. as I have all my google homes on a separate vlan otherwise they have issues (at least that's my memory of why I did it).
still up without any log output on the device. I think we (kinda) found the issue. I'm reading up on multicast, but that isn't your issue ;)
do you want me to mail this module back?
yeah, did I include a return envelope/label?
I've been trying to debug this issue for a couple weeks. I started with home assistant and Z2M, but I've been slowly working my way down the line to the point that I think the issue is my
CC2652P2 Based Zigbee to PoE Coordinator 2022
. It seems like it is going offline at random intervals, and if its offline for long enough (or some other timing works out) Z2M will crash and not auto-restart.The only way of debugging I currently can think of is pinging the IP of the coordinator. I see the following at times:
so we're only talking a couple seconds at a time. I've tried using different routers and I'm seeing the same on all of them.
Let me know if there is something else I can do to debug.