earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1

Intermittent failure to connect to wifi #2031

Closed: obdevel closed this issue 7 months ago

obdevel commented 7 months ago

This may be more of a discussion item so please feel free to move.

I have one wifi router that seems to cache previous connections and this causes problems with frequent restarts of the Pico, as happens during development. Sometimes it takes several minutes to reconnect; sometimes a hard power cycle of the Pico solves it; sometimes rebooting the router is the only solution.

Leaving the Pico powered off for some time gives an instant connection. I presume the router's cache has timed out.

This program uses both cores but the same problem occurs on single-core programs.

I'm using the latest arduino-pico core release.

The router is close by; RSSI is about -40.

My iPhone and MacBook handle this just fine and the Pico works fine with another router.

I have seen discussions elsewhere that imply the connection attempt needs to wait longer before timing out. The router's DHCP server is possibly pinging the IP address before giving it out again, which takes time, causing the Pico's connection to fail.
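To make the "wait longer" idea concrete, here is a minimal sketch of a polling wait with a configurable timeout. It is not taken from the core: `waitForConnection`, `WifiStatus`, and the three injected callbacks are hypothetical names for illustration. In a sketch they would presumably wrap `WiFi.status()`, `millis()`, and `delay()`. The numeric status values are the ones printed in my log.

```cpp
#include <cstdint>
#include <functional>

// Status codes as they appear in the log (3/WL_CONNECTED, 4/WL_CONNECT_FAILED,
// 6/WL_DISCONNECTED). Renamed here so they do not clash with the core's own enum.
enum WifiStatus : uint8_t { CONNECTED = 3, CONNECT_FAILED = 4, DISCONNECTED = 6 };

// Hypothetical helper: polls the status until it reports CONNECTED or until
// timeoutMs has elapsed. Intermediate failure states do not end the wait early,
// so a slow DHCP server gets the whole window.
bool waitForConnection(const std::function<WifiStatus()>& readStatus,
                       const std::function<uint32_t()>& nowMs,
                       const std::function<void(uint32_t)>& sleepMs,
                       uint32_t timeoutMs,
                       uint32_t pollMs = 100) {
    const uint32_t start = nowMs();
    // Unsigned subtraction stays correct if the millisecond counter wraps.
    while (static_cast<uint32_t>(nowMs() - start) < timeoutMs) {
        if (readStatus() == CONNECTED) {
            return true;
        }
        sleepMs(pollMs);
    }
    // One last check after the window closes.
    return readStatus() == CONNECTED;
}
```

Passing the clock and status reader in as callbacks keeps the helper testable off-device. Whether a longer window actually helps with this router is an assumption that would need testing.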

Sample output from my program (I initiate the connection in setup() and retry at 10 sec intervals in loop()):

       ...
       806: > device configured wifi mode = 1/STA
       806: > connecting to wifi as STA
       ...
      1672: > setup core 0 complete
       ...
      4633: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     11670: > ** reconnecting to wifi after failed attempt timeout
     12532: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     15497: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     22533: > ** reconnecting to wifi after failed attempt timeout
     23392: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     26361: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     33392: > ** reconnecting to wifi after failed attempt timeout
     34252: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     37222: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     44252: > ** reconnecting to wifi after failed attempt timeout
     45112: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     48078: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     55113: > ** reconnecting to wifi after failed attempt timeout
     55972: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     58941: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     65973: > ** reconnecting to wifi after failed attempt timeout
     66832: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     69800: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     76833: > ** reconnecting to wifi after failed attempt timeout
     77695: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     80671: > >>> wifi status change from 6/WL_DISCONNECTED to 4/WL_CONNECT_FAILED
     87696: > ** reconnecting to wifi after failed attempt timeout
     88556: > >>> wifi status change from 4/WL_CONNECT_FAILED to 6/WL_DISCONNECTED
     94498: > >>> wifi status change from 6/WL_DISCONNECTED to 3/WL_CONNECTED
     94498: > wifi is now connected
     94498: > IP address: 192.168.0.153
       ...
     94579: > network startup complete
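
The retry cadence visible in the log (a new attempt roughly every 10 seconds) can be sketched with a small interval timer. This is an illustration rather than my actual program: `RetryTimer` is a hypothetical helper, and the timestamp passed in is assumed to come from `millis()`.

```cpp
#include <cstdint>

// Hypothetical helper: reports "due" at most once per interval.
// Unsigned subtraction keeps the comparison correct across counter wraparound.
class RetryTimer {
public:
    explicit RetryTimer(uint32_t intervalMs) : intervalMs_(intervalMs) {}

    // Start (or restart) the interval from the given timestamp.
    void restart(uint32_t nowMs) { lastMs_ = nowMs; }

    // Returns true once the interval has elapsed, and rearms the timer.
    bool due(uint32_t nowMs) {
        if (static_cast<uint32_t>(nowMs - lastMs_) < intervalMs_) {
            return false;
        }
        lastMs_ = nowMs;
        return true;
    }

private:
    uint32_t intervalMs_;
    uint32_t lastMs_ = 0;
};
```

In `loop()`, the idea would be to call `due(millis())` only while the status is not connected, and to start a new connection attempt whenever it returns true.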

earlephilhower commented 7 months ago

The problem here is unfortunately the same as on the ESP8266 when WiFi got weird: we have no visibility into the binary blob that drives the WiFi chip. The driver in the SDK just passes commands back and forth to the chip, which runs its own OS and executable to handle the WiFi connection process (and all other WiFi stuff). So there's nothing we can do about it here.

As there are two layers here that could be at fault, it may be worthwhile to check whether, when the WiFi connection fails, the link layer is actually connected and only the DHCP request went unanswered (i.e., whether your router can list its radio clients separately from its DHCP leases).

earlephilhower commented 7 months ago

Another way would be to set a static IP address. Then, if the radios link up, the connection will succeed, which would point to something wrong on the DHCP side rather than in the WiFi link layer.
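
The reasoning behind this test can be written down as a small decision function. This is only a sketch of the logic: `Suspect` and `diagnose` are hypothetical names, and the two inputs are assumed to be the outcomes of one connection attempt using DHCP and one using a static address. On Arduino WiFi cores the static address is normally set with `WiFi.config(...)` before `WiFi.begin(...)`, but the argument order differs between cores, so check the documentation for the core in use.

```cpp
// Which layer the two experiments point to.
enum class Suspect { None, LinkLayer, Dhcp };

// dhcpConnected:   did an attempt using DHCP reach the connected state?
// staticConnected: did an attempt using a static IP reach the connected state?
Suspect diagnose(bool dhcpConnected, bool staticConnected) {
    if (dhcpConnected) {
        return Suspect::None;       // DHCP works, nothing to chase
    }
    if (staticConnected) {
        return Suspect::Dhcp;       // radio link is fine, address assignment is not
    }
    return Suspect::LinkLayer;      // even a static address fails, so suspect the link itself
}
```

A single pair of attempts is weak evidence for an intermittent fault, so repeating each experiment several times before drawing a conclusion seems prudent.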

obdevel commented 7 months ago

Thanks. I will experiment with a static IP address. I will also try to set up Wireshark and do some snooping.

obdevel commented 7 months ago

It seems this router is pathologically dumb. It's a portable MiFi type I got for free with a 4G data SIM.

It's been a useful learning exercise and will be helpful if I need to troubleshoot others using my code.

Thanks for your support. I'll close this issue and reopen if required.