Closed Skaronator closed 3 years ago
why are you talking about mDNS?
I believe 3 hours and 20 minutes is key here. Good finding!
I found this somewhat related https://github.com/knolleary/pubsubclient/issues/624
To me it looks like mDNS stopped working, since Home Assistant can no longer resolve the .local domain.
From my understanding, to resolve the led_controller_1.local name the Home Assistant machine sends a multicast packet to the 224.0.0.251 address and, in the best case, the device with the matching hostname responds with its own IP, which would be the ESP32 in this case. Since the HASS machine can't resolve the .local domain while still being able to ping the device over the direct IP, my guess is that mDNS stopped working. Or more precisely, the ESP32 doesn't respond to the multicast packets.
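For anyone who wants to test the lookup described above by hand, here is a minimal sketch of that query in Python. The function names are hypothetical, and the sketch assumes a plain one-shot query sent from an ephemeral port, which responders are expected to answer directly to the sender. It does not parse the answer record; it only reports who replied.

```python
import socket
import struct

MDNS_GROUP = "224.0.0.251"  # IPv4 mDNS multicast address
MDNS_PORT = 5353            # standard mDNS UDP port


def build_mdns_query(hostname: str) -> bytes:
    """Build a single-question DNS query (type A, class IN) for hostname."""
    # Header: ID=0, flags=0, QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0
    header = struct.pack("!6H", 0, 0, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in hostname.rstrip(".").split(".")
    ) + b"\x00"
    question = qname + struct.pack("!2H", 1, 1)  # QTYPE=A, QCLASS=IN
    return header + question


def resolve_once(hostname: str, timeout: float = 2.0):
    """Send one query; return the sender IP of the first reply, or None.

    Simplification: the reply is not parsed, so this only shows that
    something answered the query within the timeout.
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.settimeout(timeout)
    try:
        sock.sendto(build_mdns_query(hostname), (MDNS_GROUP, MDNS_PORT))
        _, (addr, _) = sock.recvfrom(1500)
        return addr
    except OSError:  # includes timeouts and "network unreachable"
        return None
    finally:
        sock.close()
```

Calling `resolve_once("led_controller_1.local")` from a machine on the same network segment and getting `None` while ping to the IP still works would be consistent with the ESP32 no longer answering multicast queries.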
You might want to try adding the device with a fixed IP address. This way we know for sure if it is related to mDNS.
Yup I've just re-added the device to home assistant via the IP. Let's see how it runs overnight.
Home Assistant shouldn't need to resolve it as long as it stays connected. The fact that it's trying to resolve it means it got disconnected and is trying to reconnect, but can't resolve the address because the node is mid-reboot and therefore not available to respond to mdns requests.
I'm guessing that there's some other issue causing connectivity to fail after a fixed timespan. The mdns issue is a symptom of the node dropping off the network and then rebooting when it has no API client; not the cause. The fixed timespan is a good clue though.
I set up a ping sensor in Home Assistant which pings the ESP32 over the IP. When mdns goes down, the Home Assistant server is still able to ping the ESP32 over the IP, so it seems that it is still online, as shown here:
After switching to the direct IP in Home Assistant (instead of using the mDNS name) things got even more interesting, and now I'm confused as well. As you can see, the reboots are still happening.
When I checked the Home Assistant log I could see a short reconnect. Nothing more:
2020-05-06 07:47:33 INFO (MainThread) [homeassistant.components.esphome] Disconnected from ESPHome API for 192.168.178.35
2020-05-06 07:47:35 INFO (MainThread) [homeassistant.components.esphome] Successfully connected to 192.168.178.35
2020-05-06 11:08:06 INFO (MainThread) [homeassistant.components.esphome] Disconnected from ESPHome API for 192.168.178.35
2020-05-06 11:08:07 INFO (MainThread) [homeassistant.components.esphome] Successfully connected to 192.168.178.35
2020-05-06 14:28:38 INFO (MainThread) [homeassistant.components.esphome] Disconnected from ESPHome API for 192.168.178.35
2020-05-06 14:28:39 INFO (MainThread) [homeassistant.components.esphome] Successfully connected to 192.168.178.35
Looking at the uptimes we can see the 3 hours and 20 minutes again:
01:06-04:27 = 3h21m
04:27-07:47 = 3h20m
07:47-11:08 = 3h21m
11:08-14:28 = 3h20m
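The same arithmetic can be checked down to the second. This is a small sketch that uses the disconnect timestamps from the Home Assistant log above; the helper name is made up for illustration.

```python
from datetime import datetime

# Disconnect timestamps copied from the Home Assistant log above.
DISCONNECTS = [
    "2020-05-06 07:47:33",
    "2020-05-06 11:08:06",
    "2020-05-06 14:28:38",
]


def intervals_minutes(stamps):
    """Return the gap between consecutive timestamps, in minutes."""
    times = [datetime.strptime(s, "%Y-%m-%d %H:%M:%S") for s in stamps]
    return [(b - a).total_seconds() / 60 for a, b in zip(times, times[1:])]


for minutes in intervals_minutes(DISCONNECTS):
    print(f"{minutes:.2f} min")
```

Both gaps come out at roughly 200.5 minutes, i.e. about 12,000 seconds plus the second or two it takes to reconnect.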
2 things that I'll do:
- I've connected the serial monitor to the device and will check whether the USB logs show anything different now that it is connected via the direct IP.
- Additionally, I've ordered 2 different PCB designs which I'll be using with an ESP32-WROOM-32D to rule out a hardware issue. Since shipping is currently not that reliable, it might take a few days longer than the usual 10 days.
I managed to capture the device logs over USB and, surprisingly, the device doesn't actually reboot! So no matter whether I use mDNS or the direct IP, after exactly 3 hours and 20 minutes Home Assistant disconnects for whatever reason. After the disconnect, Home Assistant immediately tries to reconnect to the ESP32. When using mDNS it can't resolve the ESP32 hostname anymore and therefore will never reconnect, but when using the IP it can connect right away.
The big question is: why does this ESP32 disconnect after the exact same amount of time while all other ESP8266-based devices continue to work fine on the same network?
[VV][api.service:032]: send_ping_response: PingResponse {}
[VV][api.service:220]: on_ping_request: PingRequest {}
[VV][api.service:032]: send_ping_response: PingResponse {}
[VV][api.service:220]: on_ping_request: PingRequest {}
[VV][api.service:032]: send_ping_response: PingResponse {}
[D][api:067]: Disconnecting Home Assistant 0.109.4 (192.168.178.5)
[VV][api.service:192]: on_hello_request: HelloRequest {
client_info: 'Home Assistant 0.109.4'
}
[V][api.connection:583]: Hello from client: 'Home Assistant 0.109.4 (192.168.178.5)'
[VV][api.service:012]: send_hello_response: HelloResponse {
api_version_major: 1
api_version_minor: 3
server_info: 'led_controller_1 (esphome v1.15.0-dev)'
}
[VV][api.service:199]: on_connect_request: ConnectRequest {
password: 'WzUv9iJtX7z3Ubf2ddysTwzPeLrvb4MLRtvvkZK'
}
[D][api.connection:599]: Client 'Home Assistant 0.109.4 (192.168.178.5)' connected successfully!
[VV][api.service:016]: send_connect_response: ConnectResponse {
invalid_password: NO
}
[VV][api.service:234]: on_device_info_request: DeviceInfoRequest {}
[VV][api.service:036]: send_device_info_response: DeviceInfoResponse {
uses_password: YES
name: 'led_controller_1'
mac_address: '84:0D:8E:D0:9E:EC'
esphome_version: '1.15.0-dev'
compilation_time: 'May 4 2020, 19:11:17'
model: 'NodeMCU-32S'
has_deep_sleep: NO
}
[VV][api.service:241]: on_list_entities_request: ListEntitiesRequest {}
[VV][api.service:085]: send_list_entities_light_response: ListEntitiesLightResponse {
object_id: 'custom_one'
key: 415191921
name: 'Custom One'
unique_id: 'led_controller_1lightcustom_one'
supports_brightness: YES
supports_rgb: NO
supports_white_value: NO
supports_color_temperature: YES
min_mireds: 200
max_mireds: 500
}
[VV][api.service:085]: send_list_entities_light_response: ListEntitiesLightResponse {
object_id: 'custom_one_rgb'
key: 443735533
name: 'Custom One RGB'
unique_id: 'led_controller_1lightcustom_one_rgb'
supports_brightness: YES
supports_rgb: YES
supports_white_value: NO
supports_color_temperature: NO
min_mireds: 0
max_mireds: 0
}
[VV][api.service:040]: send_list_entities_done_response: ListEntitiesDoneResponse {}
[VV][api.service:248]: on_subscribe_states_request: SubscribeStatesRequest {}
[VV][api.service:298]: on_subscribe_homeassistant_services_request: SubscribeHomeassistantServicesRequest {}
[VV][api.service:319]: on_subscribe_home_assistant_states_request: SubscribeHomeAssistantStatesRequest {}
[VV][api.service:091]: send_light_state_response: LightStateResponse {
key: 415191921
state: NO
brightness: 1
red: 0
green: 0
blue: 0
white: 0
color_temperature: 322
effect: ''
}
[VV][api.service:091]: send_light_state_response: LightStateResponse {
key: 443735533
state: NO
brightness: 1
red: 0
green: 1
blue: 0
white: 0
color_temperature: 0
effect: ''
}
[VV][api.service:220]: on_ping_request: PingRequest {}
[VV][api.service:032]: send_ping_response: PingResponse {}
[VV][api.service:220]: on_ping_request: PingRequest {}
[VV][api.service:032]: send_ping_response: PingResponse {}
[VV][api.service:220]: on_ping_request: PingRequest {}
[VV][api.service:032]: send_ping_response: PingResponse {}
2020-05-06 21:09:42 INFO (MainThread) [homeassistant.components.esphome] Disconnected from ESPHome API for 192.168.178.35
2020-05-06 21:09:43 INFO (MainThread) [homeassistant.components.esphome] Successfully connected to 192.168.178.35
(There is not much more :/)
Unfortunately I have no explanation, but I seem to be in the same "200min" club. In my case I have an ESP32 NodeMCU which generates sensor data. Every 30s I store the data to an sd-card and also send it via wifi to an RPi SQL database. 3h20min after start, the remote database is no longer updated. Serial output on the ESP states the following error after attempting the SQL query: Class requires connected server. The rest of the ESP keeps running fine. No reboots, no skips in the data logged to the sd-card.
I have only discovered this issue yesterday so I have not tried any solutions to keep the connection alive. Of course this is also difficult to troubleshoot when each test requires 200min...
side note: I also have a different rpi running home assistant in my network but that surely is a coincidence and not a part of the problem.
Can you both post your hardware environment? WiFi brand, hass hardware, and installation method.
Do you have this issue with many esp32?
Thanks
like this?
ESP32 NodeMCU (WROOM-32) Arduino IDE 1.8.12 esp board manager 1.0.4
Rpi 4 Model B 1gb Raspbian 10.3.22-MariaDB-0+deb10u1 phpMyAdmin 4.6.6deb5
Wifi:
- RPi 4 connected to Fritzbox 7590 router via wifi
- ESP32 connected to UniFi AP AC lite via wifi
- router connected to AP via powerline adaptor
my hass installation is on a separate Rpi 3 and should not matter here
@peterbartl are you using esphome with the api: right?
Sorry, I should have been clearer in the beginning. No, right now I am not using esphome at all. I found this thread with a google search for "esp wifi 3hours 20min". I was just trying to mention that I observe the same "magic" timeout after 200mins with an ESP32 connected to a RPi via wifi. And my setup has no home assistant and no esphome in it. So the problem is most likely on the esp32 side.
I may very well use esphome in the future as I already have a separate RPi with home assistant and am always playing with ESPs.
I should have added my info to this espressif topic which has already been cross linked here: https://github.com/espressif/arduino-esp32/issues/3986
Wifi: RPi 4 connected to Fritzbox 7590 router via wifi ESP32 connected to UniFi AP AC lite via wifi router connected to AP via powerline adaptor
Actually quite similar to my setup. I have a Fritzbox 7590 as router and use Ubiquiti NanoHD as APs where the ESPs connect to but my Home Assistant Server is hardwired to the network.
I did some more troubleshooting. My new PCBs arrived, which have a completely different layout and power rail design, but unfortunately the issue persists, as you can see here:
I've also bought an ESP32 DOIT DevKit to eliminate other possible issues, but I got the same result:
And lastly, to rule out that the Ubiquiti access points cause this issue (even though the ESP8266 runs fine on them), I set up the WiFi on my Fritzbox and connected the ESP32 to it, but sadly got the same result :(
To summarize all the things I've tried:
I have the same experience with an ESP32-WROOM-32 on a development board similar to the ESP8266 NodeMCU. I can't provide as much detail. The ESP32 connects to WiFi, talks to a Raspi via Codesys and Modbus TCP/IP, and works fine until the said 3h+. I did not measure the exact duration. The same sensor and Modbus TCP/IP settings, but on an ESP8266 NodeMCU with ESP8266WiFi.h, work fine for a long time. ElegantOTA is installed in addition. Even that works fine alongside the Modbus TCP/IP.
Additionally, I have an ESP8266EX for the ESP-01 relay module. It can't connect to WiFi at all. It loads the program and executes e.g. Blink (but with the built-in LED on GPIO1, not 2!) via a USB stick. I use the same ESP8266WiFi.h as for the NodeMCU.
I have 10 pieces each of the ESP32 and the ESP8266EX, but from the same order. So I guess the chips are from the same batch. There is no difference between them.
There are so many different WiFi libraries out here on GitHub. All say it suddenly works, but don't explain well what happened. Thank you and best regards, Michael
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.
Is there no solution for this problem? It is still pending. There is also no answer from Espressif about it.
FYI, I had the same problem, setup a little different (nodemcu esp32, micropython firmware). Problem is there with firmware from 2019-12-20, simply gone after update to recent firmware 2020-09-02.
Same problem here.
I run Hass.io and the problem exists with two devices:
Note: The (very small) disconnect I get is not synchronized between the two devices (i.e. it does not happen at the same time for both).
Possible common denominator: I use a Fritzbox 7530. Both devices have fixed IPs assigned from the router.
POSSIBLE solution: I set a static IP (hardcoded) on the esp32. It is now free of disconnects for almost a day.
Maybe it is an issue of the DHCP server of the Fritzbox? I will try to use another DHCP server at some point to confirm.
@Stamatiou, I don't think the problem is with the WiFi router. I tried several different types and even a fixed IP. I see that some go for MicroPython and use a newer firmware. I actually use the Arduino IDE, but there is no newer firmware. It happens for me with the ESP32 NodeMCU and the ESP32-CAM, and I tried a handful of different controllers. It seems I should try MicroPython as well. Does anyone have a good recommendation for an easy IDE? It was hard enough to get C running after about 40+ years. :D Now I have to learn Python too :O
@HaegarDerWikinger yeah probably not, since I have many other devices that don't have this issue. Two other users on this thread have the same (more or less) router though so I think it's worth a mention.
Just to report that I suffer from the very same problem with this setup:
The LWT mqtt topic is recorded and I can see a very quick disconnection (~1 sec) every 3h20min. As per @XStamatiou's suggestion, I am now trying two other things:
Will report results in a few days.
After 9 hours, I confirm that:
Thus, there is something related to an ESP32 which uses dhcp to get an ip address. The two "workarounds" above is what I need to proceed.
Description seems same as #910
After 9 hours, I confirm that:
- same code run on an esp8266 WORKS
- esp32 now has a static ip (defined in the code itself, not talking about assigned ip on the router) WORKS
@mr2c12 Thus, there is something related to an ESP32 which uses dhcp to get an ip address. The two "workarounds" above is what I need to proceed.
Which IDE do you use? I use the Arduino platform. The same code for the ESP32 NodeMCU works on the ESP8266 NodeMCU, except for the WiFi library.
For the ESP32 I use WiFi.h and for the ESP8266 I use ESP8266WiFi.h. Can I also use ESP8266WiFi.h for the ESP32? Best regards
Maybe this problem really is related to the router (I have a FritzBox 7590). I have several ESP32s and Yeelight lamps, and all of them have the problem with a reconnect every 200 minutes in Home Assistant. Or is it a problem with Home Assistant?
Thus, there is something related to an ESP32 which uses dhcp to get an ip address. The two "workarounds" above is what I need to proceed.
Today I tested it with a wESP32 board over the LAN interface. When using a static IP and no DHCP, it works for more than 3 hours and 20 minutes. After 10 hours I shut down the test.
I think it has nothing to do with wifi, ethernet LAN, or the router. I think DHCP is the problem, but only on the ESP32. The same code on an ESP8266 runs for days without any problems.
On the ESP32/ESP8266 I use a server implementation based on asyncServer with a permanently open TCP socket, so I can easily detect when the socket is broken and has to reconnect. Best regards
@Skaronator You wrote that you use the same code for both microcontrollers. Do you also use the same WiFi library? For the ESP32 I use WiFi.h and for the ESP8266 I use ESP8266WiFi.h. Can I also use ESP8266WiFi.h for the ESP32? Best regards, Michael
I can confirm that the workaround using a static address and fast connect has been working for me for days on an Olimex ESP-POE-ISO (ESP32 Ethernet) on the LAN interface. If I switch back to DHCP, I see the disconnect every 200 minutes.
Operating environment/Installation (Hass.io/Docker/pip/etc.):
ESP (ESP32/ESP8266, Board/Sonoff):
It's an ESP32-WROOM-32 on a custom PCB which is connected to 5 MOSFETs and 1 Dallas temperature sensor.
Affected component:
mDNS / WiFi: https://esphome.io/components/wifi.html
Description of problem:
My ESP32 loses the connection to Home Assistant every 3 hours and 20 minutes, and after 15 minutes it automatically reboots (due to reboot_timeout).
After a suggestion from @glmnet I added a ping in Home Assistant to monitor whether the device loses its connection, but that's not the case. Ping via the direct IP keeps working (until the device reboots and is down for a few seconds).
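The ping check above only shows that the device is still on the network. A complementary check is whether a TCP port on the device still accepts connections. Below is a minimal sketch of such a probe; the function names are hypothetical, and the port you pass in is an assumption about your setup (ESPHome's native API listens on 6053 by default). Note that every probe opens a real connection to the device, so keep the interval generous.

```python
import socket
import time


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, timed out, unreachable, ...
        return False


def watch(host: str, port: int, interval: float = 30.0) -> None:
    """Print a timestamped line whenever reachability changes."""
    last = None
    while True:
        up = tcp_reachable(host, port)
        if up != last:
            print(time.strftime("%Y-%m-%d %H:%M:%S"), "up" if up else "down")
            last = up
        time.sleep(interval)
```

Running `watch("192.168.178.35", 6053)` next to the ping sensor would show whether the TCP side ever becomes unreachable while ICMP keeps answering.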
Here are the times for each uptime:
As you can see there are always 3 hours and 20 minutes of uptime, so I suspect there is some kind of memory leak which breaks the mDNS after that amount of time.
Also worth mentioning that this issue does not happen on my 5 ESP8266 devices but that's expected since both use completely different mDNS implementations.
Problem-relevant YAML-configuration entries:
Logs (if applicable):
Logs from Home Assistant
These are just the logs for the latest event. As you can see, Home Assistant can no longer resolve the mDNS hostname, which means that the device doesn't respond to the multicast UDP packets.
Additional information and things you've tried:
- power_save_mode: none in WiFi (this helped a bit I think)