Open KonssnoK opened 4 months ago
@Espressif-liuuuu @zhangyanjiaoesp @nishanth-radja Do you have any findings from the log above?
@KonssnoK Are you using the ip_internal_network example? Your logs show the ping timeout, but they can't confirm that the Wi-Fi connection is down. Can you capture packets for this?
@zhangyanjiaoesp yes, the base is the ip_internal_network example. How would you check whether the Wi-Fi connection is down, apart from seeing that no packets are sent/received?
Also, how would you capture the packets? Wireshark connected to a sniffer?
So @zhangyanjiaoesp i was able to generate one strange behavior, even if it's not exactly the one reported in this issue.
With the same code (v4.4 top of c0e0af03d153d2c157d1d420831ab33d48888768 )
you can apply patches 1 2 3, which enable monitoring and pinging
03_ip_internal.patch 02_if_dumps.patch 01_packets_dump.patch
by randomly detaching/attaching the layer 2 device i was able to reach this state, where the L2 device is never able to communicate with L1 timeout_l2.txt
I got an extract of L1 too (i would say MESH_EVENT_CHILD_CONNECTED to track L2 events)
Interestingly, to recover the L2 device i had to reboot both devices: rebooting only the L2 device did not solve the issue, and neither did rebooting the L1 device while the L2 device was stuck (after its own reboot).
@zhangyanjiaoesp again by simply resetting the 2 devices in different ways, i was able to trigger another case in which one device does not work anymore until reboot. to be noted: once this device is failing, rebooting the other device makes it fail too.
timeout_reset1.txt timeout_reset2.txt
to recover the devices i had to keep them offline enough for the phone to lose the cache of connected devices ( pixel8 )
@KonssnoK please provide your sdkconfig file, and you are using PSRAM, right?
sdkconfig.txt @zhangyanjiaoesp here it is
Regarding the L2 timeout case above (timeout_l2.txt): this log shows that the device didn't get an IP address, which caused the ping timeout.
Regarding the reset case above (timeout_reset1.txt, timeout_reset2.txt): this log shows that the device can't connect to the router; the reason is an auth timeout.
I have tested using a router and can't reproduce this issue. I will test again using a mobile hotspot. Can you provide the model of your phone? Or can any phone reproduce this issue? @KonssnoK
@zhangyanjiaoesp i reproduced with a Google Pixel8.
not getting the IP - strange, would mean the IP service is stuck 🤔
@zhangyanjiaoesp i moved to 3 devices and trying to replicate but for now without success..
and as soon as i wrote that, something strange happened again: dev3 is not able to connect
dev2.txt disconnected_dev3.txt dev1.txt
I (42466) wifi:new:<1,0>, old:<1,1>, ap:<1,1>, sta:<1,1>, prof:1
W (43196) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:0
W (44396) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:1
W (45596) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:2
W (46796) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:3
W (47996) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:4
W (49196) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:5
(devices are 30cm apart from each other)
it seems it goes on forever
W (1983596) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:1617
one hour in:
W (6444006) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:5334
W (13189216) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:10955
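The WND-RX warnings above carry enough information to estimate how long a device has been stuck. The following is only a sketch: the struct and function names are hypothetical, and it assumes (from the spacing of the log lines, not from any Espressif documentation) that timeout_count grows by one per timeout period.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Fields pulled out of one "[WND-RX]" warning line (hypothetical helper type). */
typedef struct {
    unsigned long timestamp_ms;   /* value inside "W (...)" */
    unsigned long period_ms;      /* the "1200 ms timeout" value */
    unsigned long timeout_count;  /* the trailing "timeout_count:N" value */
} wnd_rx_sample_t;

/* Parse a WND-RX warning line; returns 0 on success, -1 if the line does not match. */
static int parse_wnd_rx(const char *line, wnd_rx_sample_t *out)
{
    assert(line != NULL && out != NULL);
    int n = sscanf(line,
                   "W (%lu) mesh: [mesh_schedule.c,%*d] [WND-RX]max_wnd:%*d, "
                   "%lu ms timeout, seqno:%*d, xseqno:%*d, no_wnd_count:%*d, "
                   "timeout_count:%lu",
                   &out->timestamp_ms, &out->period_ms, &out->timeout_count);
    return n == 3 ? 0 : -1;
}

/* Time covered by the counted timeouts, assuming one count per period. */
static unsigned long stuck_duration_ms(const wnd_rx_sample_t *s)
{
    return s->timeout_count * s->period_ms;
}

/* Estimated timestamp of the warning that had timeout_count:0. */
static unsigned long first_timeout_ms(const wnd_rx_sample_t *s)
{
    unsigned long d = stuck_duration_ms(s);
    return d > s->timestamp_ms ? 0 : s->timestamp_ms - d;
}
```

For the line with timeout_count:1617 at timestamp 1983596, this gives 1617 × 1200 ms = 1940400 ms (about 32 minutes), and 1983596 − 1940400 = 43196, which matches the timestamp of the first warning in the log above.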
this is instead the log of device1 getting stuck and not trying to connect to the mesh anymore
@KonssnoK In the log, I see that communication among the three devices was initially normal, and then you restarted device2 and device3?
In the logs of device2 and device3, there are lines showing I (42466) wifi:state: run -> init (2c0), which means the Wi-Fi connection was dropped. The Wi-Fi disconnection causes warnings like W (6444006) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:1, no_wnd_count:43, timeout_count:5334. But I can't find why the Wi-Fi disconnected. Can you use Wireshark to capture packets from devices 1/2/3 and send me the logs and captures? Please display the absolute time in the logs.
I have also tested with a Google Pixel 5, but I couldn't reproduce the problem.
@zhangyanjiaoesp today i have no way to use Wireshark. The only way to recover that device was to reboot it again. Yes, to trigger the issue i simply restarted devices at random. I'm not sure how to display absolute time, considering the logs come from closed-source files. I understand the Wi-Fi is disconnected, but i would expect it to retry the connection once disconnected 🤔
Yes, to trigger the issue i simply randomly restarted devices
Ok, I will try to restart the device and test again.
@zhangyanjiaoesp in my experience slow data rates help trigger the issues. Please set your phone's cell technology to 2G, or go to "developer options", networking, "network download rate limit" and set the minimum.
please note that the logs are more or less synchronized at the end, not the start! (i extract them more or less at the same time)
240619dev3.txt 240619dev1.txt 240619dev2.txt
@zhangyanjiaoesp i rebooted the root device and it went offline without managing to reconnect.
After a while device 3 managed to change status and directly connect as the root. the other 2 remain disconnected
240619dev3_2.txt 240619dev1_2.txt 240619dev2_2.txt
device 2 at some point manages to recover too.
240619dev3_3.txt 240619dev1_3.txt 240619dev2_3.txt
device one is still disconnected and not able to recover instead.
phone connected in 5G with rate limiter at 128kbps
device 1 does not recover
@zhangyanjiaoesp for reference this setup seems to trigger the issue in the above message quite often. once again the root device is stuck after a reboot, device 2 takes the root in this occasion, device 3 follows 2, but device 1 is stuck.
@zhangyanjiaoesp i tried also today to replicate, to verify if this is consistent:
it's quite easy to create issues in this configuration, please let me know if you manage.
240620dev3.txt 240620dev1.txt 240620dev2.txt
after a while dev 3 recovers and then also dev 2. dev 1 is stuck.
@KonssnoK I'm sorry, I have an urgent task recently. I will test your issue next week.
@zhangyanjiaoesp sure, i'll concentrate on another issue meanwhile
@KonssnoK I have reproduced this issue by rebooting the root device, and I have found the root cause. The following Wi-Fi libs can solve the problem. Please replace the Wi-Fi libs and test again. wifi_lib_s3_0625.zip (wifi firmware version: f736b07)
For the other issues, I still can't reproduce them, even though I randomly rebooted device2/3.
@KonssnoK I have reproduced this issue by rebooting the root device, and I have found the root cause
What's the root cause?
@zhangyanjiaoesp sure, I will try them. Should this unblock devices that stop reconnecting? We are currently seeing this kind of issue pretty much everywhere in the field. Thanks
@zhangyanjiaoesp i changed the library but i see no difference in the behavior of the children 🤔 240625dev1_1.txt 240625dev3_1.txt 240625dev2_1.txt
EDIT: i think this might have been related to the fact that the 4th device, which was acting as root, was not updated yet with latest libraries. I will now retry in the nominal configuration
It's weird. Here is the device2 log on my side when the root is rebooting. device2.txt
@zhangyanjiaoesp here is dev1 blocked again
240625dev3_2.txt 240625dev2_2.txt 240625dev1_2.txt
W (423934) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:103, no_wnd_count:0, timeout_count:209
@KonssnoK My test is based on the ip_internal_network example with the above patches applied. I noticed this print in your logs: W (106224) mesh_hand: Triggering DYNAMIC MESH handover. You used different test code, right?
yes sorry it's an evolution of your code. I will put back the old one and reproduce again
dev1 is now blocked in a
I (135890) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:2
loop
same issue triggered also on dev3
240625dev1_4.txt 240625dev3_4.txt 240625dev2_4.txt
(rate limiter always on)
with the same procedure another issue appeared:
another case of dev1 getting stuck after reset. The reset button was held for some seconds before releasing (i don't always just press/release)
@zhangyanjiaoesp last example of this issue:
240625dev3_7.txt 240625dev1_7.txt 240625dev2_7.txt
now i'll go back to the other one while you fix this
When the root device is connecting to the router and the disconnect reason is 2 (auth expire), the root keeps trying to reconnect to the router. That is why you see the I (135890) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:2 log in a loop. The user should handle this case in the application layer.
By the way, the log I (50960) wifi:state: auth -> init (200) indicates that the device sends an auth request but the router doesn't reply with an auth response. It's strange that the router doesn't reply. Could it have something to do with the hotspot used? Do you have this problem if you use another hotspot or router?
how should the user handle this error? and why does it continuously alternate with error 205?
I will try with a samsung device
with the same procedure another issue appeared:
- generation of 2 networks that do not merge back together
- dev1 and dev3 are both ROOT and do not try to merge together
By default, more than one root is allowed to exist in one mesh network. Please call esp_mesh_allow_root_conflicts(false) to disable this.
how should the user handle this error?
For example, if the auth fails because the router was moved, the user can call esp_mesh_waive_root() to switch to a better root.
and why is it reported continuously alternated to the error 205?
When a connection fails, the STA adds this AP to a blacklist. Reason code 205 indicates that the scan failed because the AP is in the blacklist; after this, the AP is removed from the blacklist. So you will see the reason code loop between 2 and 205.
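The alternation described above can be illustrated with a toy model. This is not Espressif's implementation: the type and function names are hypothetical, and the model assumes an AP that never answers auth requests, which is the situation reported in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REASON_AUTH_EXPIRE      2    /* "auth expire" in the logs */
#define REASON_CONNECTION_FAIL  205  /* scan failed, AP is blacklisted */

/* Toy STA state: only tracks whether the AP is currently blacklisted. */
typedef struct {
    bool ap_blacklisted;
} sta_model_t;

/*
 * One reconnect attempt against an AP that never replies to auth requests.
 * Returns the disconnect reason the application would observe.
 */
static int attempt_connect(sta_model_t *sta)
{
    assert(sta != NULL);
    if (sta->ap_blacklisted) {
        /* The scan skips the blacklisted AP, then the AP is un-blacklisted. */
        sta->ap_blacklisted = false;
        return REASON_CONNECTION_FAIL;
    }
    /* Auth request gets no response, so the AP goes onto the blacklist. */
    sta->ap_blacklisted = true;
    return REASON_AUTH_EXPIRE;
}
```

Calling attempt_connect() repeatedly on a zero-initialized model yields 2, 205, 2, 205, and so on, which is the same pattern that shows up in the MESH_EVENT_PARENT_DISCONNECTED logs.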
For example, if the auth fails because the router was moved, the user can call esp_mesh_waive_root() to switch to a better root.
well my router is always the same and not moving 🤔 so it is a bit strange
Is it the STA blacklisting the AP or the AP blacklisting the STA? Because i would expect the router to blacklist a device that constantly disconnects.
Also i expected the device to check if another device became root, since the rest of the network manages to recover
something like this for the waiving?
case MESH_EVENT_PARENT_DISCONNECTED: {
    mesh_event_disconnected_t *disconnected = (mesh_event_disconnected_t *)event_data;
    ESP_LOGI(MESH_TAG,
             "<MESH_EVENT_PARENT_DISCONNECTED>reason:%d",
             disconnected->reason);
    mesh_layer = esp_mesh_get_layer();
    mesh_netifs_stop();
    if (esp_mesh_is_root() && disconnected->reason == WIFI_REASON_CONNECTION_FAIL) {
        esp_mesh_waive_root();
    }
}
@zhangyanjiaoesp should esp_mesh_allow_root_conflicts(false) be called on all devices or only on the root? When should it be called? Before mesh_start?
Call it before mesh start on all devices.
Is it the STA blacklisting the AP or or the AP blacklisting the STA?
The STA adds the AP to the STA's own blacklist.
something like this for the waiving?
yes, maybe add a check for the number of failures?
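The suggested check on the number of failures could look roughly like the sketch below. It is only an illustration: the tracker type, the function name, and the threshold of 10 are hypothetical, and the reason values 2 and 205 are taken from the logs in this thread. In the real event handler, a true return value would be the point at which the application calls esp_mesh_waive_root as in the snippet above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define REASON_AUTH_EXPIRE      2    /* reason:2 in the logs */
#define REASON_CONNECTION_FAIL  205  /* reason:205 in the logs */
#define WAIVE_THRESHOLD         10   /* hypothetical: failures before giving up root */

/* Hypothetical per-device state kept by the application. */
typedef struct {
    unsigned consecutive_failures;
} root_fail_tracker_t;

/*
 * Call on every MESH_EVENT_PARENT_DISCONNECTED.
 * Returns true once the root has failed to reach the router
 * WAIVE_THRESHOLD times in a row; the counter then starts over.
 */
static bool root_should_waive(root_fail_tracker_t *t, bool is_root, int reason)
{
    assert(t != NULL);
    bool router_failure = (reason == REASON_AUTH_EXPIRE ||
                           reason == REASON_CONNECTION_FAIL);
    if (!is_root || !router_failure) {
        t->consecutive_failures = 0;  /* unrelated event: restart the count */
        return false;
    }
    t->consecutive_failures++;
    if (t->consecutive_failures < WAIVE_THRESHOLD) {
        return false;
    }
    t->consecutive_failures = 0;
    return true;
}

/* Call when the parent connection succeeds (e.g. on MESH_EVENT_PARENT_CONNECTED). */
static void root_connected(root_fail_tracker_t *t)
{
    assert(t != NULL);
    t->consecutive_failures = 0;
}
```

Counting both reason codes means a full 2/205 cycle advances the counter by two. Later in this thread, waiving is reported to return ESP_ERR_MESH_DISCARD for a root without children, so a counter alone may not be enough to recover the device.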
@zhangyanjiaoesp calling esp_mesh_waive_root always returns the error ESP_ERR_MESH_DISCARD and the device remains stuck.
Patch for latest code 04_waive_root.patch
Example of partial log:
I (455320) mesh: [wifi]disconnected reason:2(auth expire), continuous:239/max:12, root, vote(,stopped)<><>
W (456320) ping: From 8.8.8.8 icmp_seq=114 timeout
E (457320) ping_sock: send error=0
I (457990) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:205
I (457990) mesh: [wifi]disconnected reason:205(), continuous:240/max:12, root, vote(,stopped)<><>
W (457990) mesh_main: esp_mesh_waive_root 16405
I (458080) wifi:new:<11,2>, old:<11,0>, ap:<11,2>, sta:<11,0>, prof:11
I (458080) wifi:state: init -> auth (b0)
I (459080) wifi:state: auth -> init (200)
I (459090) wifi:new:<11,0>, old:<11,2>, ap:<11,2>, sta:<11,0>, prof:11
I (459090) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:2
I (459090) mesh: [wifi]disconnected reason:2(auth expire), continuous:241/max:12, root, vote(,stopped)<><>
W (460320) ping: From 8.8.8.8 icmp_seq=115 timeout
E (461320) ping_sock: send error=0
I (461760) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:205
I (461760) mesh: [wifi]disconnected reason:205(), continuous:242/max:12, root, vote(,stopped)<><>
I (461800) wifi:new:<11,2>, old:<11,0>, ap:<11,2>, sta:<11,0>, prof:11
I (461800) wifi:state: init -> auth (b0)
I (462810) wifi:state: auth -> init (200)
I (462810) wifi:new:<11,0>, old:<11,2>, ap:<11,2>, sta:<11,0>, prof:11
I (462810) mesh_main: <MESH_EVENT_PARENT_DISCONNECTED>reason:2
I (462810) mesh: [wifi]disconnected reason:2(auth expire), continuous:243/max:12, root, vote(,stopped)<><>
W (464320) ping: From 8.8.8.8 icmp_seq=116 timeout
E (465320) ping_sock: send error=0
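The value 16405 printed after esp_mesh_waive_root in the log above can be decoded by hand. The sketch below assumes the mesh error base is 0x4000 (the value of ESP_ERR_MESH_BASE in esp_err.h) and that mesh codes occupy 0x4000 to 0x4FFF; the helper name is hypothetical.

```c
#include <assert.h>

#define MESH_ERR_BASE   0x4000  /* assumed value of ESP_ERR_MESH_BASE */
#define MESH_ERR_RANGE  0x1000  /* assumption: mesh codes span 0x4000..0x4FFF */

/* Returns the offset of a mesh error code from the base, or -1 if out of range. */
static int mesh_err_offset(int err)
{
    if (err < MESH_ERR_BASE || err >= MESH_ERR_BASE + MESH_ERR_RANGE) {
        return -1;
    }
    return err - MESH_ERR_BASE;
}
```

16405 is 0x4015, so the offset is 21, which is consistent with the ESP_ERR_MESH_DISCARD reported in the comment above. On the device itself, logging esp_err_to_name(err) instead of the raw number avoids the manual lookup.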
@KonssnoK The ESP_ERR_MESH_DISCARD error indicates that the softAP doesn't have children. In your auth failure scenario, I think we should first confirm why the hotspot does not reply with an auth response. Have you tested on another hotspot or router?
No, I updated the code and retried with the changes. I will go back to testing with the Samsung phone. In any case, it should work with any router/phone 🤔. A different behavior in the AP should still be handled correctly by the STA.
with samsung phone:
W (435697) mesh: [mesh_schedule.c,3131] [WND-RX]max_wnd:2, 1200 ms timeout, seqno:0, xseqno:26, no_wnd_count:0, timeout_count:244
240627dev1_1.txt
240627dev3_1.txt
240627dev2_1.txt
@zhangyanjiaoesp the failed auth does not happen on the samsung phone but dev2 apparently gets stuck as a child.
240627dev3_2.txt 240627dev1_2.txt 240627dev2_2.txt
but on the Google phone, what should be done when the root gets into a 2/205 error loop? Of course it doesn't have children, because the rest of the network reshapes by itself. What should be done to make it recover and search for other devices?
General issue report
@zhangyanjiaoesp
based on 27ec26d2d3f44bbde5da14c7fdfc82226d567874
To reproduce
Connect to mobile hotspot Wi-Fi
Put phone SIM card to 2G.
Wait
The issue happens in the ROOT. Once the root is affected, the issue is propagated to all children.
The issue is either temporary or permanent. When permanent, the only way to recover is to reboot.
espressif_wifi_dump_3.txt espressif_wifi_dump.txt espressif_wifi_dump_2.txt
Other related issues with similar behavior: https://github.com/espressif/esp-idf/issues/8953 https://github.com/espressif/esp-idf/issues/10506