espressif / arduino-esp32

Arduino core for the ESP32
GNU Lesser General Public License v2.1

DHCP connection problem with some router models since v2.0.0 #8069

Open DigiH opened 1 year ago

DigiH commented 1 year ago

Board

ESP32 Dev module and LilyGo TTGO LoRa32 V2.1

Device Description

Routers: TP-Link Archer C7 and Huawei Fastgate DN8245f2

Hardware Configuration

N/A

Version

v2.0.7

IDE Name

PlatformIO

Operating System

macOS and Windows, various versions

Flash frequency

40MHz

PSRAM enabled

no

Upload speed

115200

Description

I have experienced router DHCP problems myself after the recent esp32_platform updates in OpenMQTTGateway, from 3.3.1 through 5.2.0 to now 6.1.0 with its bundled arduinoespressif32 3.20007.0. The very latest version is also causing major connection issues for some users, as reported in

https://community.openmqttgateway.com/t/esp32-unable-to-reconnect-to-wifi-after-reboot-with-huawei-fastgate-dn8245f2/2406

I started testing with the WiFiClientConnect example, with the following findings for my Archer C7 problems.

espressif32@6.1.0 - arduinoespressif32 3.20007.0 serial log

[WiFi] Connecting to TestNet
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.21
[WiFi] Disconnecting from WiFi!
[WiFi] Disconnected from WiFi!

(debug below) and the related router log image

which eventually does connect for me, but only if no other devices are trying to connect or renew their DHCP lease (set to 24 hours) at the same time. For users with the Fastgate DN8245f2 router, however, this seems to cause complete connection failures with their LilyGo gateways after a reconnection attempt.

Using exactly the same WiFiClientConnect sketch with

espressif32@3.5.0 - arduinoespressif32 3.10006.0 the result is

[WiFi] Connecting to TestNet
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.21
[WiFi] Disconnecting from WiFi!
[WiFi] Disconnected from WiFi!

(debug below) with a nice quick and clean image on the router side.

While this really only seems to be an issue with certain router models, the only current stopgap measure was to revert to the espressif32@3.5.0 platform with a test version for affected users. Not really ideal.
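A note for anyone reading the logs above: the repeated `WiFi Status: 0` lines are the example's fallback branch printing the raw `WiFi.status()` value. The sketch below is a host-side copy of the status values as I understand them from `wl_status_t` in arduino-esp32 2.0.x (an assumption, so check it against your core), together with a hypothetical `wlStatusName()` helper that is not part of the WiFi library. Under that assumption, `0` is `WL_IDLE_STATUS`, which here appears between `STA_CONNECTED` and `STA_GOT_IP`, i.e. associated with the access point but still waiting for DHCP.

```cpp
#include <cassert>
#include <string>

// Host-side copy of the wl_status_t values as I understand them from
// WiFiType.h in arduino-esp32 2.0.x. Assumption: verify against your core.
enum WlStatus {
    WL_IDLE_STATUS     = 0,
    WL_NO_SSID_AVAIL   = 1,
    WL_SCAN_COMPLETED  = 2,
    WL_CONNECTED       = 3,
    WL_CONNECT_FAILED  = 4,
    WL_CONNECTION_LOST = 5,
    WL_DISCONNECTED    = 6,
    WL_NO_SHIELD       = 255
};

// Hypothetical helper (not part of the WiFi library):
// turn the numeric status printed in the log into a readable name.
std::string wlStatusName(int status) {
    switch (status) {
        case WL_IDLE_STATUS:     return "WL_IDLE_STATUS";
        case WL_NO_SSID_AVAIL:   return "WL_NO_SSID_AVAIL";
        case WL_SCAN_COMPLETED:  return "WL_SCAN_COMPLETED";
        case WL_CONNECTED:       return "WL_CONNECTED";
        case WL_CONNECT_FAILED:  return "WL_CONNECT_FAILED";
        case WL_CONNECTION_LOST: return "WL_CONNECTION_LOST";
        case WL_DISCONNECTED:    return "WL_DISCONNECTED";
        case WL_NO_SHIELD:       return "WL_NO_SHIELD";
        default:                 return "UNKNOWN(" + std::to_string(status) + ")";
    }
}
```

In a sketch, something like `Serial.println(wlStatusName(WiFi.status()).c_str());` in the fallback branch would make the waiting phase easier to spot in the serial output.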

Any help and pointers to narrow down where the underlying problem might lie, and/or how to possibly provide more detailed information to address this is greatly appreciated.

Thanks

Sketch

[WiFiClientConnect.ino](https://github.com/espressif/arduino-esp32/blob/master/libraries/WiFi/examples/WiFiClientConnect/WiFiClientConnect.ino)

Debug Message

**espressif32@6.1.0 - arduinoespressif32 3.20007.0**
[WiFi] Connecting to TestNet
[    72][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 0 - WIFI_READY
[   181][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[   179][V][WiFiGeneric.cpp:340] _arduino_event_cb(): STA Started
[   246][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 2 - STA_START
[WiFi] WiFi is disconnected
[   380][V][WiFiGeneric.cpp:355] _arduino_event_cb(): STA Connected: SSID: TestNet, BSSID: aa:bb:cc:dd:ee:ff, Channel: 1, Auth: WPA2_PSK
[   500][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[  6203][V][WiFiGeneric.cpp:369] _arduino_event_cb(): STA Got New IP:192.168.0.21
[  6204][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  6243][D][WiFiGeneric.cpp:996] _eventCallback(): STA IP: 192.168.0.21, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.21
[WiFi] Disconnecting from WiFi!
[  6628][V][WiFiGeneric.cpp:362] _arduino_event_cb(): STA Disconnected: SSID: TestNet, BSSID: aa:bb:cc:dd:ee:ff, Reason: 8
[WiFi] Disconnected from WiFi!
[  6716][V][WiFiGeneric.cpp:343] _arduino_event_cb(): STA Stopped
[  6714][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 5 - STA_DISCONNECTED
[  6909][W][WiFiGeneric.cpp:955] _eventCallback(): Reason: 8 - ASSOC_LEAVE
[  6988][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 3 - STA_STOP
[WiFi] Disconnecting from WiFi!

**espressif32@3.5.0 - arduinoespressif32 3.10006.0**
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 0 - WIFI_READY
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 2 - STA_START
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 7 - STA_GOT_IP
[D][WiFiGeneric.cpp:419] _eventCallback(): STA IP: 192.168.0.21, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.21
[WiFi] Disconnecting from WiFi!
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 3 - STA_STOP
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 3 - STA_STOP
[WiFi] Disconnected from WiFi!
[WiFi] Disconnecting from WiFi!

Other Steps to Reproduce

No response

I have checked existing issues, online documentation and the Troubleshooting Guide

mrengineer7777 commented 1 year ago

WiFi issue. Likely related to changes in underlying IDF. Worth investigating but may need that router to reproduce. Won't be solved quickly.

SuGlider commented 1 year ago

@DigiH - I remember reading about some DHCP issues with an older IDF version. Something related to that was fixed in IDF 4.4.2.

I'd like to ask you, if possible, to build your project using Arduino IDE instead of PIO and use the latest Arduino Core 2.0.7. In case you can do it, please post the execution log.

DigiH commented 1 year ago

Thanks for the tip @SuGlider, but unless I am mistaken, with Platformio and using espressif32@6.1.0, the esp-idf version is

"framework-espidf": {
      "type": "framework",
      "optional": true,
      "owner": "platformio",
      "version": "~3.50001.0",
      "optionalVersions": ["~3.40404.0"]
    },

so v5.0.1 with optionalVersions 4.4.4. Whereas the, for me and a few others, correctly working espressif32@3.5.0 has

"framework-espidf": {
      "type": "framework",
      "optional": true,
      "owner": "platformio",
      "version": "~3.40302.0",
      "optionalVersions": ["~3.40001.0"]
    },

I already tested with this older version in espressif32@6.1.0, but the issue remains the same, very apparent with the initial logging of

[    74][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 0 - WIFI_READY
[   188][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[   186][V][WiFiGeneric.cpp:340] _arduino_event_cb(): STA Started
[   253][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 2 - STA_START
…

I will still try the WiFiClientConnect example in the ArduinoIDE as you suggested, but for a final addressing of the problem for OpenMQTTGateway Platformio really would be required.
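For readers unfamiliar with the package version strings quoted in this comment, here is a minimal sketch of how they appear to map to upstream versions. It assumes the middle field encodes `major * 10000 + minor * 100 + patch`, which is only inferred from the pairs mentioned in this thread (`3.50001.0` as v5.0.1, `3.20007.0` as 2.0.7, `3.10006.0` as 1.0.6). `decodePackageVersion()` is a hypothetical helper, not a PlatformIO API.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: decode a PlatformIO-style package version such as
// "~3.50001.0" into "5.0.1". Assumption (inferred from this thread only):
// the middle field is major * 10000 + minor * 100 + patch.
// Returns an empty string when the input does not match that shape.
std::string decodePackageVersion(const std::string& packageVersion) {
    const std::size_t firstDot = packageVersion.find('.');
    if (firstDot == std::string::npos) return "";
    const std::size_t secondDot = packageVersion.find('.', firstDot + 1);
    if (secondDot == std::string::npos) return "";

    const std::string middle =
        packageVersion.substr(firstDot + 1, secondDot - firstDot - 1);
    // Only accept a short, purely numeric field so std::stol cannot throw.
    if (middle.empty() || middle.size() > 9 ||
        middle.find_first_not_of("0123456789") != std::string::npos) {
        return "";
    }

    const long encoded = std::stol(middle);
    const long major = encoded / 10000;
    const long minor = (encoded / 100) % 100;
    const long patch = encoded % 100;
    return std::to_string(major) + "." + std::to_string(minor) + "." +
           std::to_string(patch);
}
```

With this reading, `decodePackageVersion("~3.40404.0")` gives `4.4.4` and `decodePackageVersion("~3.40302.0")` gives `4.3.2`.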

DigiH commented 1 year ago

> I'd like to ask you, if possible, to build your project using Arduino IDE instead of PIO and use the latest Arduino Core 2.0.7. In case you can do it, please post the execution log.

@SuGlider - pretty much the same results with the Arduino IDE. Tested with latest Arduino IDE 2.0.4, Arduino IDE 2.0.5 nightly and also older Arduino 1.8.13

With latest Arduino Core 2.0.7

[    66][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 0 - WIFI_READY
[   171][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[   169][V][WiFiGeneric.cpp:340] _arduino_event_cb(): STA Started
[   235][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 2 - STA_START
[WiFi] WiFi is disconnected
[   367][V][WiFiGeneric.cpp:355] _arduino_event_cb(): STA Connected: SSID: TestNet, BSSID: aa:bb:cc:dd:ee:ff, Channel: 1, Auth: WPA2_PSK
[   491][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[  6523][V][WiFiGeneric.cpp:369] _arduino_event_cb(): STA Got New IP:192.168.0.51
[  6523][D][WiFiGeneric.cpp:931] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  6562][D][WiFiGeneric.cpp:996] _eventCallback(): STA IP: 192.168.0.51, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.51

going back step by step to 2.0.0

⸮⸮⸮⸮⸮⸮⸮⸮): Arduino Event: 0 - WIFI_READY
[   157][V][WiFiGeneric.cpp:272] _arduino_event_cb(): STA Started
[   160][D][WiFiGeneric.cpp:808] _eventCallback(): Arduino Event: 2 - STA_START
[   234][V][WiFiGeneric.cpp:96] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[  2861][V][WiFiGeneric.cpp:284] _arduino_event_cb(): STA Connected: SSID: TestNet, BSSID: aa:bb:cc:dd:ee:ff, Channel: 1, Auth: WPA2_PSK
[  2868][D][WiFiGeneric.cpp:808] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[  8940][V][WiFiGeneric.cpp:294] _arduino_event_cb(): STA Got New IP:192.168.0.51
[  8940][D][WiFiGeneric.cpp:808] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  8979][D][WiFiGeneric.cpp:857] _eventCallback(): STA IP: 192.168.0.51, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.51

and only when switching back to 1.0.6

[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 0 - WIFI_READY
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 2 - STA_START
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[WiFi] WiFi is disconnected
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:374] _eventCallback(): Event: 7 - STA_GOT_IP
[D][WiFiGeneric.cpp:419] _eventCallback(): STA IP: 192.168.0.51, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.51

ADDENDUM: I have now also tested this with other router models (my neighbours'), and on each model there is always an initial error message during the DHCP IP assignment, with the connection only succeeding eventually, after a few pings and pongs, as my TP-Link Archer C7 logs above show. An immediate DHCP connection/IP assignment is only ever possible with 1.0.6. For users with the ISP-provided Huawei Fastgate DN8245f2 router, this is also the only version that makes a DHCP connection at all.
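To put a number on the delay in the timestamped 2.0.x logs above, the gap between the `STA_CONNECTED` and `STA_GOT_IP` event lines can be read off directly. The following is a minimal host-side sketch for doing that on a captured serial log. Both function names are hypothetical, and it assumes the `[<milliseconds>]` prefix and the `Arduino Event: N - NAME` wording seen in these logs (the 1.0.6 logs carry no timestamps, so it returns -1 for them).

```cpp
#include <cassert>
#include <cstddef>
#include <sstream>
#include <string>

// Extract the millisecond timestamp from a line such as
// "[  6204][D][WiFiGeneric.cpp:931] _eventCallback(): ...".
// Returns -1 when the line does not start with "[<digits>]".
long logTimestampMs(const std::string& line) {
    if (line.empty() || line[0] != '[') return -1;
    const std::size_t close = line.find(']');
    if (close == std::string::npos) return -1;

    std::string field = line.substr(1, close - 1);
    const std::size_t firstNonSpace = field.find_first_not_of(' ');
    if (firstNonSpace == std::string::npos) return -1;
    field = field.substr(firstNonSpace);

    // Only accept a short, purely numeric field so std::stol cannot throw.
    if (field.size() > 9 ||
        field.find_first_not_of("0123456789") != std::string::npos) {
        return -1;
    }
    return std::stol(field);
}

// Milliseconds between the STA_CONNECTED event line and the first
// STA_GOT_IP event line after it; -1 if either is missing.
long dhcpWaitMs(const std::string& log) {
    std::istringstream in(log);
    std::string line;
    long connectedAt = -1;
    while (std::getline(in, line)) {
        const long ts = logTimestampMs(line);
        if (ts < 0) continue;  // plain "[WiFi] ..." lines carry no timestamp
        if (line.find("- STA_CONNECTED") != std::string::npos) {
            connectedAt = ts;
        } else if (connectedAt >= 0 &&
                   line.find("- STA_GOT_IP") != std::string::npos) {
            return ts - connectedAt;
        }
    }
    return -1;
}
```

Applied to the logs in this thread, that gap is 5704 ms for the PlatformIO 2.0.7 run (500 to 6204), 6032 ms for the Arduino IDE 2.0.7 run (491 to 6523) and 6072 ms for 2.0.0 (2868 to 8940).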

SuGlider commented 1 year ago

@DigiH - In your last testing with Arduino IDE, it seems that DHCP worked fine. In other words, the ESP32 STA got an IP from the DHCP server.

In Arduino Core 2.0.x we use IDF 4.x. In Arduino Core 1.0.x we use IDF 3.x.

LwIP may have changed in IDF from the version 3.x to 4.x, thus there are "more steps", but it does get an IP at the end of the process.

Do you still see a problem with DHCP?

DigiH commented 1 year ago

> In your last testing with Arduino IDE, it seems that DHCP worked fine. In other words, the ESP32 STA got an IP from the DHCP server.

@SuGlider Fine isn't quite the word I would use ;) Yes, the test case got an IP eventually even with the many DHCP to and fros, but this was only because of the ideal staged test cases.

If a second or, heaven forbid, several more ESP32s try to get an IP at the same time, this DHCP ping-pong goes on for many minutes or forever, sometimes to the point that none of the devices gets an IP. The router's DHCP logs show these continuous requests without any success, and eventually the router's DHCP server gets messed up to the point that other devices trying to renew their DHCP lease are also cut off from the network completely. This was the situation which unfortunately originally introduced me to this problem.

First I thought it was some issue with my router, so I set it up again from scratch, but the issue came back, so more investigation was in order, which finally brought me to the issue here. The only way to solve such occurrences is to manually one by one disconnect all ESP32s from power to give the DHCP server the chance to settle down and normalise again, then only reconnect the ESP32s one by one, with enough time in between to avoid the same (apologies for the term, but it really feels like) DHCP DOS attack again.
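The manual "reconnect one by one, with enough time in between" routine described above could in principle be approximated in firmware by giving each device a different start-up delay. The following is only a sketch of that idea: `staggerDelayMs()` is a hypothetical helper, keying the delay on the last MAC byte is my own choice, and nobody in this thread has tested whether it actually avoids the DHCP problem.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: spread the first WiFi connect of several devices over
// `slotCount` time slots of `slotMs` milliseconds each, using the last byte
// of the station MAC address to pick the slot. Devices whose last MAC byte
// falls into the same slot will still start at the same time.
std::uint32_t staggerDelayMs(const std::uint8_t mac[6],
                             std::uint32_t slotCount,
                             std::uint32_t slotMs) {
    if (slotCount == 0) return 0;  // no staggering requested
    return (mac[5] % slotCount) * slotMs;
}
```

On a device this would be called once before `WiFi.begin()`, for example with the MAC read via `WiFi.macAddress(mac)` and the result passed to `delay()`. The slot count and slot length (say 8 slots of 5000 ms) are arbitrary values that would need tuning per network.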

> LwIP may have changed in IDF from the version 3.x to 4.x, thus there are "more steps", but it does get an IP at the end of the process.

I am not in a position to say how and which part is causing the problem, but the "more steps" are definitely causing issues. In ideal staged conditions, as with the tests above, the device does get an IP eventually, but in real-world use they more often than not cause major DHCP and IP connection issues when there is more than just one device around.

Just out of curiosity and with pure ignorance about the underlying changes, are these 'additional' steps of any advantage in other areas, or why is it not possible to have the same one REQUEST, one ACKNOWLEDGEMENT DHCP success from version 2.0.0 onwards, as it was with 1.0.6 and before?

> Do you still see a problem with DHCP?

Very much so, as for me the only workable solution at the moment is to exclusively use the older arduino-esp32 version 1.0.6, to avoid DHCP network break-downs again, of which I experienced enough to make this not so easy decision, as obviously using a later or latest version would be preferable for other reasons, board compatibility for one.

And let's not forget the Huawei Fastgate DN8245f2 users, for whom there never was any DHCP success with 2.0.7 and for whom 1.0.6 is the only working solution.

Thanks

Jason2866 commented 1 year ago

@DigiH I had strange issues in a very specific use case with Tasmota here too. I reverted the lwip used in IDF 4.4 to an older commit and compiled an Arduino PlatformIO framework with it, which solved my issue. You can try the build with:

platform                    = https://github.com/tasmota/platform-espressif32/releases/download/2023.02.00/platform-espressif32.zip
platform_packages           = framework-arduinoespressif32 @ https://github.com/Jason2866/esp32-arduino-lib-builder/releases/download/1312/framework-arduinoespressif32-lwip_timeout-ed6742e7f0.zip

If that solves it, we have a simple sketch and an env setup where it is reproducible, so we can open an issue in the IDF GitHub. My case is too complicated and random to open an issue. I can just say that an older lwip version does solve the issue for it.

DigiH commented 1 year ago

Hi @Jason2866,

Thanks a lot for your reply, but I have searched, read and tested your suggested solution with the custom framework in the other thread already before posting the issue here :)

Unfortunately it didn't resolve or change the DHCP connection problems others and I are seeing.

The multiple DISCOVER, OFFER, REQUEST, ACKNOWLEDGEMENT to and fros seem to be happening with all routers I can get my hands on for testing, compared to the working 1.0.6 version, but only seem to cause connection issues or total failures with some. Whether this depends only on the different routers' DHCP implementation/speed and the number of devices trying to get connected and obtain an IP at the same time, or also, to an extent, on the number of statically assigned DHCP IPs, I do not know.

Jason2866 commented 1 year ago

Maybe newest build... It is based on latest IDF 4.4 (branch release/v4.4). Updated wifi libs (closed source) and newer lwip.

platform_packages  =  https://github.com/Jason2866/esp32-arduino-lib-builder/releases/download/1313/framework-arduinoespressif32-release_v4.4-c6ee118fa2.zip

DigiH commented 1 year ago

Thanks a lot @Jason2866 for your suggestions and newer test builds, but the results stay the same for me: the constant ping-pong of "Will I connect, and if so, provided no others are trying to connect at the same time, then when?" with the simple WiFiClientConnect example (results below), as well as with the full latest OpenMQTTGateway dev build.

[   181][V][WiFiGeneric.cpp:97] set_esp_interface_ip(): Configuring Station static IP: 0.0.0.0, MASK: 0.0.0.0, GW: 0.0.0.0
[WiFi] WiFi is disconnected
[   917][V][WiFiGeneric.cpp:355] _arduino_event_cb(): STA Connected: SSID: TestNet, BSSID: aa:bb:cc:dd:ee:ff, Channel: 1, Auth: WPA2_PSK
[   926][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 4 - STA_CONNECTED
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[WiFi] WiFi Status: 0
[  6957][V][WiFiGeneric.cpp:369] _arduino_event_cb(): STA Got New IP:0
[  6958][D][WiFiGeneric.cpp:1035] _eventCallback(): Arduino Event: 7 - STA_GOT_IP
[  6998][D][WiFiGeneric.cpp:1098] _eventCallback(): STA IP: 192.168.0.51, MASK: 255.255.255.0, GW: 192.168.0.1
[WiFi] WiFi is connected!
[WiFi] IP address: 192.168.0.51

So if and until this might get addressed the only option for myself and some others is to stick with version 1.0.6 😞

Jason2866 commented 1 year ago

Since you have a reproducible issue, you could write an IDF example and open an issue in the IDF GitHub.

DigiH commented 1 year ago

I would have thought that the simple straightforward supplied WiFiClientConnect example would have been the best to show the issue, but will look into your suggestion, thanks.

DigiH commented 1 year ago

Newly reported router which has major problems as reported above, again only solved with a build pre-v2.0.0

ASUS ZenWiFi AX (XT8)

FatherMarco1971 commented 2 months ago

I have the same problem here. I finally found this post, and I'll check soon whether reverting to an older version solves the problem.