espressif / esp-mdf

Espressif Mesh Development Framework (limited maintenance; https://github.com/espressif/esp-mesh-lite is recommended instead)

wifi not reconnecting after being disconnected for a long time #236

Open will-emmerson opened 3 years ago

will-emmerson commented 3 years ago

Environment

Problem Description

I have a mesh root device running code similar to the mwifi example in the SDK. When the Wi-Fi router is turned off, left off for a few minutes, and then turned on again, the device reconnects OK:

I (1985) [EVENT_LOOP, 28]: MWIFI_STARTED
W (5895) wifi:<MESH AP>adjust channel:1, secondary channel offset:1(40U)
I (5895) [EVENT_LOOP, 155]: MWIFI_FIND_NETWORK [mac=0000, channel=2]
W (5905) wifi:<MESH AP>adjust channel:2, secondary channel offset:1(40U)
I (9025) [EVENT_LOOP, 47]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=2, type=1, layer=1, rssi=-43, ssid=TP-Link_07FC]
I (9035) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=false]
I (9035) [EVENT_LOOP, 148]: MWIFI_ROOT_ADDRESS [mac=A9C5]
W (9035) wifi:<ba-add>idx:0 (ifx:0, 00:31:92:f3:07:fc), tid:0, ssn:2, winSize:64
I (00:00:09.279) esp_netif_handlers: sta ip: 192.168.0.100, mask: 255.255.255.0, gw: 192.168.0.1
I (9845) [EVENT_LOOP, 98]: MWIFI_ROOT_GOT_IP [ip=192.168.0.100]
I (16:46:11.066) SNTP: Got time via NTP (2021-08-17T16:46:11+0100), waiting 60 minutes
I (12455) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=true]
I (16:46:12.626) MQTT: MQTT Connected, subscribing to send/240ac425a9c4

<router turned off>

W (273585) wifi:<ba-del>idx
E (16:50:33.764) TRANS_SSL: ssl_poll_read select error 113, errno = Software caused connection abort, fd = 54
E (16:50:33.767) MQTT_CLIENT: Poll read error: 0, aborting connection
I (273595) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=07FC, ssid=TP-Link_07FC]
I (273605) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=false]
I (275925) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
I (278255) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
I (280585) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
I (343525) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
...
E (16:51:43.783) esp-tls: Failed to connnect to host (errno 113)
E (16:51:43.783) esp-tls: Failed to open new connection
E (16:51:43.784) TRANS_SSL: Failed to open a new connection
E (16:51:43.791) TRANSPORT_WS: Error connecting to host <redacted>:443
E (16:51:43.799) MQTT_CLIENT: Error transport connect
I (345855) [EVENT_LOOP, 72]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
...

< router turned on >

I (354935) [EVENT_LOOP, 47]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=9, type=1, layer=1, rssi=-39, ssid=TP-Link_07FC]
I (354935) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=false]
I (354945) [EVENT_LOOP, 148]: MWIFI_ROOT_ADDRESS [mac=A9C5]
W (358645) wifi:!<ba-add>idx:0 (ifx:0, 00:31:92:f3:07:fc), tid:0, ssn:2, winSize:64!
I (16:52:01.711) MQTT: MQTT Connected, subscribing to send/240ac425a9c4
I (361545) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=true]

But when the Wi-Fi router is turned off for 24 hours or so and then turned back on, the mesh device doesn't connect properly:

I (90604030) [EVENT_LOOP, 67]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
I (90606970) [EVENT_LOOP, 67]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
...

< router turned on >

W (90607280) [EVENT_LOOP, 116]: MWIFI_CHANNEL_SWITCH [channel=2]
W (90610320) wifi:!Haven't to connect to a suitable AP now!!
W (90623360) wifi:!Next TBTT incorrect! last beacon:411933224, offset:74375, next beacon:422378024, beacon interval:102400, dtim period:0, dtim count:0, listen interval:3, now:428086668!
I (90623380) [EVENT_LOOP, 46]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=2, type=1, layer=1, rssi=-42, ssid=TP-Link_07FC]
I (90623380) [EVENT_LOOP, 139]: MWIFI_TODS_STATE [connected=false]
I (90623390) [EVENT_LOOP, 145]: MWIFI_ROOT_ADDRESS [mac=A9C5]

It appears to reconnect, but it doesn't seem to be getting an IP address. Is there some kind of timeout that is longer than 1-2 hours but less than 24 hours?

There are two other non-mesh ESP32 devices running ESP-IDF v4.3 which do reconnect after the Wi-Fi router has been down for 24 hours, so it seems to be something specific to esp-mdf rather than anything to do with the router itself.
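
The symptom described here (MWIFI_PARENT_CONNECTED is logged, but neither MWIFI_ROOT_GOT_IP nor MWIFI_TODS_STATE [connected=true] follows) can be spotted mechanically in a captured serial log. The following is only a minimal sketch of such a check, not part of esp-mdf: the function name `find_stalled_reconnects` and the rule used to call a reconnect "stalled" are assumptions made for illustration, and the sample lines are copied from the logs above.

```python
import re

# Matches event lines such as:
#   I (9845) [EVENT_LOOP, 98]: MWIFI_ROOT_GOT_IP [ip=192.168.0.100]
# Group 1: millisecond timestamp, group 2: event name, group 3: bracketed details.
EVENT_RE = re.compile(
    r"\((\d+)\) \[EVENT_LOOP, \d+\]: (MWIFI_[A-Z_]+)(?: \[(.*)\])?"
)


def find_stalled_reconnects(log_text):
    """Return timestamps (ms) of MWIFI_PARENT_CONNECTED events that were not
    followed by MWIFI_ROOT_GOT_IP or MWIFI_TODS_STATE [connected=true].

    Assumption: either of those two events means the root is usable again.
    A log that simply ends right after a connect is also reported.
    """
    stalled = []
    pending = None  # timestamp of a connect still waiting for confirmation
    for line in log_text.splitlines():
        match = EVENT_RE.search(line)
        if match is None:
            continue  # not an EVENT_LOOP line (e.g. wifi: or esp-tls output)
        timestamp_ms = int(match.group(1))
        event = match.group(2)
        details = match.group(3) or ""
        if event in ("MWIFI_PARENT_CONNECTED", "MWIFI_PARENT_DISCONNECTED"):
            if pending is not None:
                stalled.append(pending)  # previous connect never completed
            pending = timestamp_ms if event == "MWIFI_PARENT_CONNECTED" else None
        elif event == "MWIFI_ROOT_GOT_IP" or (
            event == "MWIFI_TODS_STATE" and "connected=true" in details
        ):
            pending = None  # reconnect completed
    if pending is not None:
        stalled.append(pending)
    return stalled


# Reconnect after a few minutes (works), taken from the first log.
GOOD_LOG = """\
I (354935) [EVENT_LOOP, 47]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=9, type=1, layer=1, rssi=-39, ssid=TP-Link_07FC]
I (354935) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=false]
I (354945) [EVENT_LOOP, 148]: MWIFI_ROOT_ADDRESS [mac=A9C5]
I (361545) [EVENT_LOOP, 142]: MWIFI_TODS_STATE [connected=true]
"""

# Reconnect after roughly 24 hours (stalls), taken from the second log.
BAD_LOG = """\
I (90606970) [EVENT_LOOP, 67]: MWIFI_PARENT_DISCONNECTED [mac=0000, ssid=TP-Link_07FC]
W (90607280) [EVENT_LOOP, 116]: MWIFI_CHANNEL_SWITCH [channel=2]
I (90623380) [EVENT_LOOP, 46]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=2, type=1, layer=1, rssi=-42, ssid=TP-Link_07FC]
I (90623380) [EVENT_LOOP, 139]: MWIFI_TODS_STATE [connected=false]
I (90623390) [EVENT_LOOP, 145]: MWIFI_ROOT_ADDRESS [mac=A9C5]
"""

if __name__ == "__main__":
    print(find_stalled_reconnects(GOOD_LOG))  # []
    print(find_stalled_reconnects(BAD_LOG))   # [90623380]
```

Running a check like this over a long capture makes it easier to see how long the router has to be down before reconnects start to stall.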

will-emmerson commented 3 years ago

I realised some of the logs were set to WARN only. Here are the logs again of it not connecting properly, in case that helps:

I (66715533) wifi:<connect>csa, newchan=6, old=10, csa_count:15
W (66715543) [EVENT_LOOP, 120]: MWIFI_CHANNEL_SWITCH [channel=6]
W (66718323) wifi:Haven't to connect to a suitable AP now!
I (66731543) wifi:switch to channel 6
I (66731543) wifi:ap channel adjust o:10,2 n:6,2
I (66731543) wifi:new:<6,2>, old:<10,2>, ap:<6,2>, sta:<10,2>, prof:10
I (66731543) wifi:new:<6,2>, old:<6,2>, ap:<6,2>, sta:<6,2>, prof:10
I (66731553) wifi:state: init -> auth (b0)
I (66731573) wifi:state: auth -> assoc (0)
I (66731613) wifi:state: assoc -> run (10)
I (66731623) wifi:connected with TP-Link_07FC, aid = 1, channel 6, 40D, bssid = 00:31:92:f3:07:fc
I (66731623) wifi:security: WPA2-PSK, phy: bgn, rssi: -29
I (66731623) wifi:pm start, type: 0
W (66731633) wifi:Next TBTT incorrect! last beacon:2289794917, offset:52481, next beacon:2300239717, beacon interval:102400, dtim period:0, dtim count:0, listen interval:3, now:2305943390
I (66731643) mesh: [scan]new scanning time:600ms, beacon interval:300ms
I (66731663) mesh: <flush>upstream packets, connections(max):6, waiting:0, upQ:0
I (66731663) [mwifi, 175]: Parent is connected
I (66731663) wifi:I (66731663) mesh: <flush>root AP's beacon interval = 102400 us, DTIM period = 1
I (66731663) mesh: [TXQ]<max:32>up(0, be:0), down(0, be:0), mgmt:0, xon(req:0, rsp:0), bcast:0, wnd(0, parent:00:00:00:00:00:00)
I (66731663) [mwifi, 258]: State represents: 0
I (66731683) mesh: [RXQ]<max:32 = cfg:32 + extra:0>self:0, <max:32 = cfg:32 + extra:0>tods:6
I (66731673) [EVENT_LOOP, 47]: MWIFI_PARENT_CONNECTED [mac=07FC, channel=6, type=1, layer=1, rssi=-31, ssid=TP-Link_07FC]
I (66731713) [EVENT_LOOP, 143]: MWIFI_TODS_STATE [connected=false]
I (66731713) [EVENT_LOOP, 149]: MWIFI_ROOT_ADDRESS [mac=A9C5]

JL1946 commented 3 years ago

Have there been any developments with this issue? I am having a similar problem: I am using another ESP32-WROOM development board as the AP. When it goes offline for a period of time and then comes back online, the root indicates that it has reconnected to the AP but fails to create a socket with the AP server. I get error 113 (No Route to Host).

zhanzhaocheng commented 3 years ago

esp-idf v4.3.1 fixes some related problems, although they are not exactly the same as this one. Please update the esp-idf under ESP-MDF to v4.3.1 and check whether the problem is resolved.