espressif / arduino-esp32

Arduino core for the ESP32
GNU Lesser General Public License v2.1
13.02k stars 7.3k forks source link

WiFi Auto Reconnect Still Not Working - Is there a best practice to guarantee connection? #653

Closed vseven closed 5 years ago

vseven commented 6 years ago

Board: ModeMCU ESP32 Dev Module Core Installation/update date: 15/Sep/2017 IDE name: Arduino IDE

It seems there is no auto reconnect logic allowing the ESP32 to reconnect when the connection drops or if there is its not working properly. I have had completely random results with my ESP32 board, sometimes its up for as little as 30 minutes and at most 18 - 20 hours. WiFi is dropping and doesn't want to reconnect.

The library I am using for WiFi is watching the WiFi event and on a disconnect its trying to reconnect which helps but sometimes it fails the reconnect and then that's it...I have to reboot. There is no sleep/low power mode at all involved and the device is constantly powered by 3.3v in. There was a case opened regarding this, https://github.com/espressif/arduino-esp32/issues/353, but its closed without any real answer (says auto connect is kind of implemented...not sure what that means).

I guess the question comes down to is auto reconnect implemented in the code itself, reliably, and if not what is the best Arduino code way of making sure it is connected?

-Allan

me-no-dev commented 6 years ago

can you please enable debug in the board menu and see why it does not want to reconnect when you tell it to do so?

vseven commented 6 years ago

I can try that tonight. The Arduino code is pretty basic, watching the WiFi event and then has this:

case SYSTEM_EVENT_STA_DISCONNECTED:
    Serial.println("WiFi lost connection.  Attempting to reconnect...");
    WiFi.reconnect();
    break;

But shouldn't there be something that can be called to make this happen on it's own?

nepeee commented 6 years ago

Just for a quick fix: Start a timer in the SYSTEM_EVENT_STA_DISCONNECTED that calls the WiFi.reconnect(); function and stop it in the SYSTEM_EVENT_STA_CONNECTED. The main problem is with your idea is if the device cant connect than the SYSTEM_EVENT_STA_DISCONNECTED event is not called so its never try to reconnect again.

vseven commented 6 years ago

Unfortunately I'm adapting someone else library and I'm not very familiar with C++ but I'll see what I can do about adding a timer (I know VB.net...I can at least figure out the logical parts and hopefully adapt).

But I think you are right about it not reconnecting or better yet only trying on a disconnect which then if it fails just once it give up. Enabling debug might show me that so I'll start there.

But back to the original question and the previously closed issue: Is this functionality not built-in? And if not why was the previous issue closed as "kind of" working? Also if not will it be?

vseven commented 6 years ago

I switched the debug level to debug and opened serial monitor and got this over and over:

[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...

I checked the last time it sent data to another controller and that was over 24 hours ago so I'm assuming its stuck in this mode. So I powered it down and back up, it connected to WiFi just fine and transmitted data how it should for about 2 hours this time. Then without any other message it disconnected and started with the exact same message over and over. I let it go for a hour, rebooted the controller, and its connected again but I'm sure its just a matter of time before it starts over.

So the first question is how do I fix it. Second this still leads me to my previous questions about auto reconnect

me-no-dev commented 6 years ago

Guys please enable debug so you can see more verbose WiFi state output: screen shot 2017-09-22 at 13 48 16

everslick commented 6 years ago

I use the following code to check if my device is still connected to the STA (call this from loop()). You get the idea? wifi_is_connected is a global that gets set and reset in the wifi event callback.

static void poll_watchdog_sta(void) {
  static uint32_t ms = millis();
  static uint8_t watchdog = 0;

  // watch out for STA disconnects and reboot if
  // connection cannot be reestablished after
  // X minutes

  // this watchdog MUST NOT reboot the device if
  // wifi is not enabled or never was connected
  // since last reboot

  if (wifi_is_enabled && watchdog_enabled && ((millis() - ms) > 1000 * 6)) {
    ms = millis();

    if (!wifi_is_connected) {
      if (++watchdog < watchdog_timeout * 10) { // timeout in minutes ( * 10 )
        if (watchdog == 1) {
          log_print(F("WIFI: arming network watchdog (reboot in %i min.)"),
            watchdog_timeout
          );
        }
      } else {
        log_print(F("WIFI: still not connected, triggering reboot ..."));
        system_reboot();
      }
    } else {
      if (watchdog) {
        log_print(F("WIFI: network is back, disarming watchdog"));
        watchdog = 0;
      }
    }
  }
}
vseven commented 6 years ago

Yeah, I don't see a Loop() in the library I'm using for WiFi (again, third party) but I understand what you are doing and can run with it.

Going to turn on the verbose logging and see if I get any more info first before changing anything.

Also the WiFi Events that are being checked and acted on are SYSTEM_EVENT_STA_GOT_IP, SYSTEM_EVENT_STA_DISCONNECTED, SYSTEM_EVENT_STA_START, and SYSTEM_EVENT_STA_CONNECTED. There are more however at https://github.com/espressif/esp-idf/blob/master/components/esp32/include/esp_event.h so I added a default: to the switch that should print out the event to see if the library is getting something but just not acting on it.

I saw in the library I'm using a //WiFi.setAutoReconnect(true); so I'm assuming the original author tried to use auto reconnect but it wasn't working so commented it out?

vseven commented 6 years ago

@me-no-dev - Verbose gave a much better explanation. Well...maybe to you:

Everything was fine then got this over and over, maybe 150 times:

[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113

Followed by this:

[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 7 - NOT_ASSOCED
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 202 - ASSOC_FAIL
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5

The the last part (reconnect, event 5, reason 2) just repeats every 4 - 5 seconds.

vseven commented 6 years ago

Based on that first error that repeats over and over I found this: https://github.com/espressif/arduino-esp32/issues/180. In it you said "Try flushing the client or reading all available data. ". How would I go about doing that? I'm looking at the library I'm using and this is what they have to catch the WiFi Event:

//**************************************************************************************
/// Event Handler for ESP32 WiFi Events (needed to implement reconnect logic for now...)
//**************************************************************************************
    void MyESP32WiFi::WiFiEvent(WiFiEvent_t event)
    {
        Serial.printf("[WiFi-event] event: %d\n", event);

        switch (event) {
        case SYSTEM_EVENT_STA_GOT_IP:
            Serial.println("WiFi connected");
            Serial.println("IP address: ");
            Serial.println(WiFi.localIP());
            break;
        case SYSTEM_EVENT_STA_DISCONNECTED:
            Serial.println("WiFi lost connection.  Attempting to reconnect...");
            WiFi.reconnect();
            break;
        case SYSTEM_EVENT_STA_START:
            Serial.println("ESP32 station start");
            break;
        case SYSTEM_EVENT_STA_CONNECTED:
            Serial.println("ESP32 station connected to AP");
            break;
        default:            
            Serial.println("Unhandled WiFi Event raised.");
            break;
        }

    }

Based on the errors in the previous post I'm getting the "5" which is translating to the WiFi connection lost and then the code is trying to execute "WiFi.reconnect();" but its obviously not working. Should I have something else there? Maybe a counter kinda like @everslick example but simpler (just a i = i + 1) and if it end up on the same place after 20 tries it reboots? Or should this not happen/be prevented in the first place?

me-no-dev commented 6 years ago

Try calling WiFi.begin() instead of reconnect and report back :)

vseven commented 6 years ago

@me-no-dev - I changed it to WiFi.begin() instead of WiFi.reconnect() but it didn't help. Worked for about a hour then got the lwip_connect_r:113 line, each one about a minute or so apart followed by the disconnect every 5 or so seconds:

[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 7 - NOT_ASSOCED
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE

And then the 4 lines just repeat indefinitely until the control is rebooted. What else can I try? The fact @everslick is rebooting the controller when its in the same disconnected state is worrisome....I would think rebooting is the last option and something can be done to keep it online.

As a side note two ESP8266's connected to the same router (Netgear R8000) have no issues and have been stable for over a month without disconnects.

me-no-dev commented 6 years ago

hmmm.... this seems like an issue either in the lower stack or in your router. you can see the first reason being NOT_ASSOCED and then you get AUTH_EXPIRE. could you maybe turn off the STA and then turn it back on? The disconnect cause makes me think that something in the re-assoc request is either not changed or your router does not accept, so it refuses you to connect back. I'll see if I can raise an internal issue about this. What is the router firmware version that you are using?

vseven commented 6 years ago

Its a Netgear Nighthawk R8000. Currently on the latest firmware V1.0.3.54_1.1.37 although it did it on the last version also. I have 20 - 25 devices connected at any given time without issues including the previously mentioned two ESP8266's.

I will attempt to replace the Wi-Fi begin/reconnect line with the STA off and on and see what happens.

Edit: Knowing not much about this and using someone elses library....how do I "turn off the STA and then turn it back on"?

Second Edit: It couldn't be related to https://github.com/espressif/esp-idf/issues/499#issuecomment-314262611 could it? One of the only things I've seen with the same NOT_ASSOCED and AUTH_EXPIRE. I wouldn't think so only because I compiled myself without the BLE stuff but never know.

me-no-dev commented 6 years ago

I have also alerted the wifi team :) we will see if anything comes out of this

vseven commented 6 years ago

Thanks. Should I be trying to do the STA off and on? If so can you guide me how or point me in the right direction?

I also saw this: https://github.com/espressif/esp-idf/issues/738#issuecomment-311626685 which indicated that I can't call the WiFi.connect or WiFi.Begin from the event handler? Could that also be my issue? @rojer ?

vseven commented 6 years ago

So I bought a second ESP32, same one, to try and make sure it isn't a hardware issue. I recompiled the firmware yesterday night off a fresh pull from github (https://nodemcu.readthedocs.io/en/dev-esp32/en/build/) and flashed both the new chip and the old one. I also flashed both with the exact same sketch. The original is in my garage, a little ways from my router, and the other test one is in my living room where the WiFi is a lot stronger. Both have serial monitor running on a connected PC with verbose set. I'll report back anything I find.

With the above said is there anything else I can try? How do I do the STA off and on? Can that be done in the STA_Disconnect part where the WiFi.reconnect is?

vseven commented 6 years ago

Well that didn't last long....I'm having the same issue with the new one. First the original one started with the same "[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113" error and not sending data after barely 30 minutes online. Usually got that error two at a time with about 5 or 6 seconds between then then 45 - 50 seconds between the next set but again kinda random. Did that for around 10 minutes then went into the same disconnect pattern:


[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
Data: Sending: temperature1 -49.02
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
Data: Sending: humidity1 -5.02
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 7 - NOT_ASSOCED
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 202 - ASSOC_FAIL
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5

And repeated the last 4 lines over and over. At one point it did do something different...the error message changed slightly:

[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 113
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 118
Data: Sending: temperature2 -1.00
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 118
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 118
Data: Sending: humidity2 -1.00
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 118
[E][WiFiClient.cpp:97] connect(): lwip_connect_r: 118
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5

Notice the error switched to ending in 118 as opposed to 113. But from that point on it just repeated the 118 error randomly along with the STA_DISCONNECTED and AUTH_EXPIRED over and over.

Then the new ESP32 started doing the exact same thing. mostly 113 error then the same pattern of NOT_ASSOCED, ASSOC_FAIL, and AUTH_EXPIRE followed by just the disconnect and AUTH_EXPIRE over and over.

They failed at different times, 25 minutes apart, while they were started at the exact same time. WiFi signal doesn't seem to be a factor nor does any certain timing.

Edit: Just to be clear on a reboot it connects instantly on the first try every time:

rst:0x1 (POWERON_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
flash read err, 1000
ets_main.c 371 
ets Jun  8 2016 00:22:57

rst:0x10 (RTCWDT_RTC_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0010,len:4
load:0x3fff0014,len:708
load:0x40078000,len:0
load:0x40078000,len:11460
entry 0x400789f4
Everything: init started
Everything: Free RAM = -1
Disabling ESP32 WiFi Access Point
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 2 - STA_START
[WiFi-event] event: 2
Initializing ESP32 WiFi network.  Please be patient...
ESP32 station start
Attempting to connect to WPA SSID: (SSIDRemoved)
..[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 4 - STA_CONNECTED
[WiFi-event] event: 4
ESP32 station connected to AP
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 7 - STA_GOT_IP
[WiFi-event] event: 7
WiFi connected
IP address: 
192.168.xx.xx
me-no-dev commented 6 years ago

Still waiting on response from the WiFi team. Will make some more noise as this is annoying to me as well. Issue is somewhere way below Arduino and in the closed source wifi lib.

vseven commented 6 years ago

Thanks. I'm going to attempt to add a counter to the STA_DISCONNECTED event switch statement that I keep getting and after 5 try to re-init the wifi and after 10 reboot the controller and on a connect reset the counter to 0, see if that at least keeps the controller up.

If there is anything else I can provide you to help get this fixed or code to try please let me know. If it matters the ESP32 I'm using is being used for two voltage inputs, one binary input, and a DTH22 (temp + humidity) then that data is being sent to a controller through HTTP.

liuzfesp commented 6 years ago

Hi @vseven, I just can't reproduce this issue myself, could you let me know the firmware version of WiFi? You can get if from the startup log, it looks like "I (588) wifi: wifi firmware version: 2cd69aa", then I can provide a debug version wifi lib to you to debug this issue.

liuzfesp commented 6 years ago

@vseven could you let me know the WiFi reconnect logic? I means when to call esp_wifi_connect()? It will be helpful if you can paste the implementation logic.

me-no-dev commented 6 years ago

@liuzfesp wifi logging is disabled in Arduino. @vseven comment this line to get it to show

vseven commented 6 years ago

Ok, I commented out that line and recompiled. I'll report back. As a side note I tried to add a counter to reconnect by recalling my init() but it didn't do anything....it was like the init() was never called. I might be doing it wrong though. But when the counter got to 10 the ESP.restart defiantly was called and rebooted correctly which afterward it reconnected:

    void SmartThingsESP32WiFi::WiFiEvent(WiFiEvent_t event)
    {
        Serial.printf("[WiFi-event] event: %d\n", event);
        switch (event) {
        case SYSTEM_EVENT_STA_GOT_IP:
            Serial.println("WiFi connected");
            Serial.println("IP address: ");
            Serial.println(WiFi.localIP());
            break;
        case SYSTEM_EVENT_STA_DISCONNECTED:
            Serial.println("WiFi lost connection.  Attempting to reconnect...");
            WiFi.reconnect();
            disconnectCounter++;
            if (disconnectCounter > 5) {
                Serial.println("We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...");
                void init();
            }
            if (disconnectCounter > 10) {
                Serial.println("We have recieved the STA_DISCONNECTED event 10 times now.  Reboot...");
                ESP.restart();
            }
            break;
        case SYSTEM_EVENT_STA_START:
            Serial.println("ESP32 station start");
            break;
        case SYSTEM_EVENT_STA_CONNECTED:
            Serial.println("ESP32 station connected to AP");
            disconnectCounter = 0;
            break;
        }
    }

@liuzfesp - here is the extra info with that line commented out:

I (243) wifi: wifi firmware version: c1b8a2f
I (243) wifi: config NVS flash: enabled
I (243) wifi: config nano formating: disabled
I (257) wifi: Init dynamic tx buffer num: 32
I (257) wifi: Init data frame dynamic rx buffer num: 64
I (257) wifi: Init management frame dynamic rx buffer num: 64
I (261) wifi: wifi driver task: 3ffd0d8c, prio:23, stack:4096
I (267) wifi: Init static rx buffer num: 10
I (270) wifi: Init dynamic rx buffer num: 0
I (274) wifi: Init rx ampdu len mblock:7
I (278) wifi: Init lldesc rx ampdu entry mblock:4
I (282) wifi: wifi power manager task: 0x3ffd60f0 prio: 21 stack: 2560
I (290) wifi: wifi timer task: 3ffd7148, prio:22, stack:3584
I (314) wifi: mode : null
Disabling ESP32 WiFi Access Point

I (815) wifi: Init Ampdu: 1 tx baw=6 rx baw=6
I (815) wifi: mode : sta (30:ae:a4:25:78:64)
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 2 - STA_START

Initializing ESP32 WiFi network.  Please be patient...
[WiFi-event] event: 2
ESP32 station start
Attempting to connect to WPA SSID: (SSIDREMOVED)
.I (1946) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
.I (2603) wifi: state: init -> auth (b0)
I (2605) wifi: state: auth -> assoc (0)
I (2612) wifi: state: assoc -> run (10)
I (2670) wifi: connected with (SSIDREMOVED), channel 1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 4 - STA_CONNECTED
[WiFi-event] event: 4
ESP32 station connected to AP
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 7 - STA_GOT_IP
[WiFi-event] event: 7
WiFi connected

And the files I'm using (before I added the disconnect counter):

SmartThingsESP32WiFi.h.txt SmartThingsESP32WiFi.cpp.txt

vseven commented 6 years ago

@liuzfesp - So it did the same thing with the debug enabled, hopefully something in here will tell you something. It ran fine for a hour or so then started throwing the 113 error randomly. After about 15 minutes of random 113 errors in between the actual data sending routines it did this:

I (5427111) wifi: state: run -> auth (7c0)
I (5427111) wifi: pm stop, total sleep time: 0/1119534452

I (5427111) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 7 - NOT_ASSOCED
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (5427133) wifi: state: auth -> init (0)
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 202 - ASSOC_FAIL
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (5427276) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5427276) wifi: state: init -> auth (b0)
I (5428276) wifi: state: auth -> init (2)
I (5428277) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (5428413) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5428414) wifi: state: init -> auth (b0)
I (5429414) wifi: state: auth -> init (2)
I (5429414) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (5429551) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5429551) wifi: state: init -> auth (b0)
I (5430551) wifi: state: auth -> init (2)
I (5430551) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (5430688) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5430688) wifi: state: init -> auth (b0)
I (5431689) wifi: state: auth -> init (2)
I (5431689) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
I (5431826) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5431826) wifi: state: init -> auth (b0)
I (5432826) wifi: state: auth -> init (2)
I (5432826) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
I (5432963) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5432963) wifi: state: init -> auth (b0)
I (5433964) wifi: state: auth -> init (2)
I (5433964) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
I (5434100) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5434101) wifi: state: init -> auth (b0)
I (5435101) wifi: state: auth -> init (2)
I (5435101) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
I (5435238) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5435238) wifi: state: init -> auth (b0)
I (5436238) wifi: state: auth -> init (2)
I (5436239) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
I (5436375) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (5436376) wifi: state: init -> auth (b0)
I (5437376) wifi: state: auth -> init (2)
I (5437376) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[D][WiFiGeneric.cpp:182] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:187] _eventCallback(): Reason: 2 - AUTH_EXPIRE
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event 5 times now.  Re-init...
We have recieved the STA_DISCONNECTED event 10 times now.  Reboot...
I (5437405) wifi: flush txq
I (5437407) wifi: stop sw txq
I (5437410) wifi: lmac stop hw txq
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)

So for each STA_DISCONNECT it tried calling WiFi.reconnect with no apparent success then it tried running init() after more then 5 disconnects which also didn't help then once it hit 10 disconnects it rebooted which did reboot and reconnect.

vseven commented 6 years ago

I kept serial monitor running on my ESP32 all weekend and it rebooted about 10 times between 6pm Friday night and 9pm Sunday night. Each time it was exactly like the above logs.

@me-no-dev - can you guide me in a way to just turn off and back on the STA to see if that also works for reconnecting and I can try that after 5 failed attempts? The reboot after 10 works but I plan on using this to control RGB lighting and I really don't want to have lights turning off and on.

@liuzfesp - Is there anything else I can provide you with for debugging? Or anything that stands out I can try?

vseven commented 6 years ago

Any updates? Still have constant disconnects with both devices.

me-no-dev commented 6 years ago

@liuzfesp any news here?

vseven commented 6 years ago

I see there are some code changes coming with bluetooth and WiFi. Not sure if any of that will help me or not. I still continue to have the same issues and the reboot is working to keep my data being passed but I haven't been able to use it for RGB light strip control since the lights kept turning off on a reboot.

@liuzfesp - Is there anything in the logs that helps or anything else I can provide or try?

vseven commented 6 years ago

I cleaned up some things, made sure in my code there was nothing doing delays or blocking, and I now have less errors (don't see the 113 and 118 errors) but still having the same disconnect issue and it seems directly related to the WiFi trying to authenticate (renew?) and failing. Here is a fresh capture from this weekend where I had it reboot about 5 times in a 24 hour period. Again, same thing on two different ESP32s, I look for the disconnect (event 5) and try to reconnect with a WiFi.reconnect and a WiFi.begin both with no luck so after 10 tries it reboots and connects fine afterward:

I (1903652) wifi: active cnt: 5
I (1913652) wifi: active cnt: 5
I (1923652) wifi: send null to keep active
I (1933652) wifi: send null to keep active
I (1943653) wifi: active cnt: 1
I (1953653) wifi: active cnt: 2
I (1963653) wifi: send null to keep active
I (1973653) wifi: send null to keep active
I (1983653) wifi: send null to keep active
I (1993653) wifi: active cnt: 1
I (2003654) wifi: send null to keep active
I (2013654) wifi: send null to keep active
I (2023654) wifi: send null to keep active
I (2033654) wifi: send null to keep active
I (2043654) wifi: send null to keep active
I (2053654) wifi: send null to keep active
I (2063654) wifi: send null to keep active
I (2073655) wifi: send null to keep active
I (2083655) wifi: active cnt: 1
I (2093655) wifi: send null to keep active
I (2103655) wifi: send null to keep active
I (2113655) wifi: send null to keep active
I (2123655) wifi: send null to keep active
I (2133655) wifi: send null to keep active
I (2143656) wifi: send null to keep active
I (2153656) wifi: send null to keep active
I (2163656) wifi: send null to keep active
I (2173656) wifi: send null to keep active
I (2183656) wifi: send null to keep active
I (2193656) wifi: send null to keep active
I (2203657) wifi: send null to keep active
I (2213657) wifi: send null to keep active
I (2223657) wifi: send null to keep active
I (2233657) wifi: send null to keep active
I (2243657) wifi: send null to keep active
I (2253657) wifi: send null to keep active
I (2263657) wifi: send null to keep active
I (2273658) wifi: active cnt: 2
I (2283658) wifi: send null to keep active
I (2293658) wifi: send null to keep active
I (2303658) wifi: send null to keep active
I (2313658) wifi: send null to keep active
I (2323658) wifi: send null to keep active
I (2333658) wifi: send null to keep active
I (2343659) wifi: send null to keep active
I (2353659) wifi: send null to keep active
I (2363659) wifi: send null to keep active
I (2373659) wifi: send null to keep active
I (2383659) wifi: send null to keep active
I (2393659) wifi: send null to keep active
I (2403660) wifi: send null to keep active
I (2413660) wifi: send null to keep active
I (2423660) wifi: send null to keep active
I (2433660) wifi: send null to keep active
I (2443660) wifi: send null to keep active
I (2453660) wifi: send null to keep active
I (2463660) wifi: send null to keep active
I (2473661) wifi: send null to keep active
I (2483661) wifi: send null to keep active
I (2483663) wifi: state: run -> auth (7c0)
I (2483663) wifi: pm stop, total sleep time: 0/-1824912959

I (2483663) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2483674) wifi: state: auth -> init (0)
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2483805) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2483805) wifi: state: init -> auth (b0)
I (2484805) wifi: state: auth -> init (2)
I (2484806) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2484930) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2484931) wifi: state: init -> auth (b0)
I (2485931) wifi: state: auth -> init (2)
I (2485931) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2486056) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2486056) wifi: state: init -> auth (b0)
I (2487056) wifi: state: auth -> init (2)
I (2487056) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2487181) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2487181) wifi: state: init -> auth (b0)
I (2488182) wifi: state: auth -> init (2)
I (2488182) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2488306) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2488307) wifi: state: init -> auth (b0)
I (2489307) wifi: state: auth -> init (2)
I (2489307) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2489432) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2489432) wifi: state: init -> auth (b0)
I (2490432) wifi: state: auth -> init (2)
I (2490432) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2490557) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2490557) wifi: state: init -> auth (b0)
I (2491558) wifi: state: auth -> init (2)
I (2491558) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2491682) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2491683) wifi: state: init -> auth (b0)
I (2492683) wifi: state: auth -> init (2)
I (2492683) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
I (2492808) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
I (2492808) wifi: state: init -> auth (b0)
I (2493808) wifi: state: auth -> init (2)
I (2493808) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
[WiFi-event] event: 5
WiFi lost connection.  Attempting to reconnect...
We have recieved the STA_DISCONNECTED event over 10 times now.  Reboot...
I (2493819) wifi: flush txq
I (2493822) wifi: stop sw txq
I (2493824) wifi: lmac stop hw txq
ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:1
load:0x3fff0010,len:4
load:0x3fff0014,len:708
load:0x40078000,len:0
load:0x40078000,len:11460
entry 0x400789f4
Everything: init started
Everything: Free RAM = -1
I (749) wifi: wifi firmware version: c1b8a2f
I (749) wifi: config NVS flash: enabled
I (749) wifi: config nano formating: disabled
I (758) wifi: Init dynamic tx buffer num: 32
I (758) wifi: Init data frame dynamic rx buffer num: 64
I (758) wifi: Init management frame dynamic rx buffer num: 64
I (762) wifi: wifi driver task: 3ffca34c, prio:23, stack:4096
I (768) wifi: Init static rx buffer num: 10
I (771) wifi: Init dynamic rx buffer num: 0
I (775) wifi: Init rx ampdu len mblock:7
I (779) wifi: Init lldesc rx ampdu entry mblock:4
I (783) wifi: wifi power manager task: 0x3ffd6004 prio: 21 stack: 2560
I (791) wifi: wifi timer task: 3ffd705c, prio:22, stack:3584
I (815) wifi: mode : null
Disabling ESP32 WiFi Access Point

I (1816) wifi: Init Ampdu: 1 tx baw=6 rx baw=6
I (1816) wifi: mode : sta (30:ae:a4:07:f1:98)
[WiFi-event] event: 2

Initializing ESP32 WiFi network.  Please be patient...
ESP32 station start
Attempting to connect to WPA SSID: (SSIDREMOVED)
.I (2943) wifi: n:1 0, o:1 0, ap:255 255, sta:1 0, prof:1
.I (3600) wifi: state: init -> auth (b0)
I (3602) wifi: state: auth -> assoc (0)
I (3606) wifi: state: assoc -> run (10)
I (3635) wifi: connected with (SSIDREMOVED), channel 1
[WiFi-event] event: 4
ESP32 station connected to AP
[WiFi-event] event: 7
WiFi connected
IP address: 
192.168.1.142

@liuzfesp / @me-no-dev - Is there any other information I can provide? Is this happening to anyone else or only with my two HiLetGo ModeMCU ESP32s? Is there any custom firmware that I can try?

-Allan

Markusenz commented 6 years ago

I have the exact same behaviour on my WROOM-32 module. Every 3rd or 4th reboot, it hangs in the STA_DISCONNECTED -> AUTH_EXPIRED loop. Checked a second module of same type, showing same behaviour. My Router is a TL-WR841 (TPLINK) but happens as well with a ASUS RT-AC87U.

When will there be a solution, its useless if there is need of a reboot in such situations...

edit: for me, this problem also occurs after boot, i.e. WiFi.begin() will not establish a connection at all but loops in STA_DISCONNECTED -> AUTH_EXPIRED

vseven commented 6 years ago

I haven't gotten a response in a month although, no offense, I'm happy it's happening to someone else so I know I'm not alone.

@liuzfesp / @me-no-dev - Has there been any progress on this or anything we can try to get this to stay connected reliably to WiFi.

me-no-dev commented 6 years ago

poked @liuzfesp again... this is as much as I can really do. I do not have access to the WiFi lib source

copercini commented 6 years ago

@igrr

me-no-dev commented 6 years ago

@copercini why?

igrr commented 6 years ago

Since i was tagged here... @vseven, is there any chance you could perform packet capture using Wireshark and a WiFi adapter in monitor mode? Of particular interest is the time when reconnect process starts failing. Since we don't know what is happening here, and we can't reproduce similar issue in our CI environment, the packet capture might reveal some anomaly, giving a clue about the situation you are facing.

Another thought (just to reduce the number of variables), does this issue also happen if you connect to an open WiFi network?

Markusenz commented 6 years ago

currently, all reconnects fail with my WROOM32s with following debug output:

[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START ....[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED ..[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP WiFi connected ... then I block antenna reception or press RESET at my AP, to force a disconnect ... [D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED [W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 200 - NO_AP_FOUND ... when reception should be ok again, it takes about 2 minutes until it says:

[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 8 - STA_WPS_ER_SUCCESS

However it remains disconnected from network forever in that state. WPS is turned off at my AP. Do not know what the debug message means. Same behaviour with two different APs and two different WROOM32 modules, using the wificlient example sketch. No matter if WPA2 used or open network. Can send Wireshark log in monitor mode if needed.

sidoh commented 6 years ago

First of all, thanks everyone for your hard work on the SDK. I've had a ton of fun developing with it, and I'm very grateful for all of the effort that's gone into it.

I'm seeing this issue too. I think I may have managed to narrow it down a little bit. I'm using this really small sketch that should work on both ESP8266 and ESP32:

https://github.com/sidoh/esp32_reconnect_demo

I turned off auto reconnect so that I had more control of the variables.

Setup

  1. Testing on both ESP8266 and ESP32
  2. Using a mobile hotspot network. Tried both WPA2 PSK and open networks
  3. Allow MCU to connect to network and run for a few seconds. I added an NTP client to verify connectivity.

Then I tried two different things:

(A) Tear down network, immediately recreate

Under these circumstances, the network is usually recreated by the time the MCU notices it's disconnected.

(B) Tear down network, wait for MCU to notice it's disconnected, recreate network

Observations

  1. Behavior is the same with both WPA2 and open networks.
  2. The ESP8266 successfully reconnects with both (A) and (B).
  3. ESP32 reconnects in experiment (A), but gets stuck in (B).
  4. ESP32 does successfully reconnect in both (A) and (B) if: WiFi.disconnect(true); and WiFi.begin(WIFI_SSID, WIFI_PASSWD); are called. Seems like both are necessary.
    \ The boolean wifioff parameter to disconnect being set to true also seems necessary. Since WiFi.disconnect(true); appears to clear WiFi settings, it's necessary to re-specify the ssid and password in the call to begin.

Logs

All of these are with an open network. I'm omitting the ESP8266 logs since the behavior was as expected, but can provide if they'd be useful.

ESP32, experiment (A)

I (25) wifi: wifi firmware version: 708a055
I (26) wifi: config NVS flash: enabled
I (26) wifi: config nano formating: disabled
I (31) wifi: Init dynamic tx buffer num: 32
I (32) wifi: Init data frame dynamic rx buffer num: 64
I (32) wifi: Init management frame dynamic rx buffer num: 64
I (35) wifi: wifi driver task: 3ffd46dc, prio:23, stack:4096
I (40) wifi: Init static rx buffer num: 10
I (44) wifi: Init dynamic rx buffer num: 0
I (48) wifi: wifi power manager task: 0x3ffd9418 prio: 21 stack: 2560
I (425) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
I (2836) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (2836) wifi: state: init -> auth (b0)
I (2853) wifi: state: auth -> assoc (0)
I (2872) wifi: state: assoc -> run (10)
I (2872) wifi: connected with glowyrectangle, channel 6
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Setup completed
[4] ssid=glowyrectangle
[5] ssid=glowyrectangle
I (5872) wifi: pm start, type:0

[6] ssid=glowyrectangle
[7] ssid=glowyrectangle
[8] ssid=glowyrectangle
[9] ssid=glowyrectangle
[10] ssid=glowyrectangle
I (10388) wifi: state: run -> auth (7c0)
I (10389) wifi: pm stop, total sleep time: 0/4516399

I (10389) wifi: n:6 0, o:6 0, ap:255 255, sta:6 0, prof:1
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 7 - NOT_ASSOCED
W (11413) wifi: Haven't to connect to a suitable AP now!
[11] ssid=
Attempting to reconnect...
I (13822) wifi: n:6 0, o:6 0, ap:255 255, sta:6 0, prof:1
I (13823) wifi: state: auth -> auth (b0)
I (13826) wifi: state: auth -> assoc (0)
I (13839) wifi: state: assoc -> run (10)
I (13839) wifi: connected with glowyrectangle, channel 6
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Connection result: 3
[1513219385] ssid=glowyrectangle
[1513219386] ssid=glowyrectangle
[1513219387] ssid=glowyrectangle
[1513219388] ssid=glowyrectangle
I (16839) wifi: pm start, type:0

ESP32, experiment (B)

rst:0x10 (RTCWDT_RTC_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:812
load:0x40078000,len:0
load:0x40078000,len:11404
entry 0x40078aa0
I (25) wifi: wifi firmware version: 708a055
I (26) wifi: config NVS flash: enabled
I (26) wifi: config nano formating: disabled
I (35) wifi: Init dynamic tx buffer num: 32
I (36) wifi: Init data frame dynamic rx buffer num: 64
I (36) wifi: Init management frame dynamic rx buffer num: 64
I (39) wifi: wifi driver task: 3ffd46d0, prio:23, stack:4096
I (44) wifi: Init static rx buffer num: 10
I (48) wifi: Init dynamic rx buffer num: 0
I (52) wifi: wifi power manager task: 0x3ffd940c prio: 21 stack: 2560
I (431) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
I (2842) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (2843) wifi: state: init -> auth (b0)
I (2846) wifi: state: auth -> assoc (0)
I (2863) wifi: state: assoc -> run (10)
I (2863) wifi: connected with glowyrectangle, channel 6
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Setup completed
[3] ssid=glowyrectangle
[4] ssid=glowyrectangle
I (5863) wifi: pm start, type:0

[5] ssid=glowyrectangle
[1513219573] ssid=glowyrectangle
[1513219574] ssid=glowyrectangle
[1513219575] ssid=glowyrectangle
[1513219576] ssid=glowyrectangle
[1513219577] ssid=glowyrectangle
[1513219578] ssid=glowyrectangle
[1513219579] ssid=glowyrectangle
I (13174) wifi: bcn_timout,ap_probe_send_start
[1513219580] ssid=glowyrectangle
[1513219581] ssid=glowyrectangle
[1513219582] ssid=glowyrectangle
I (15676) wifi: ap_probe_send over, resett wifi status to disassoc
I (15677) wifi: state: run -> init (1)
I (15678) wifi: pm stop, total sleep time: 0/9813809

I (15678) wifi: n:6 0, o:6 0, ap:255 255, sta:6 0, prof:1
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 200 - NO_AP_FOUND
W (16387) wifi: Haven't to connect to a suitable AP now!
[1513219583] ssid=
Attempting to reconnect...
Connection result: 5
[E][WiFiUdp.cpp:183] endPacket(): could not send data: 118
W (17412) wifi: Haven't to connect to a suitable AP now!
[1513219584] ssid=
Attempting to reconnect...
Connection result: 5
[E][WiFiUdp.cpp:183] endPacket(): could not send data: 118
W (18437) wifi: Haven't to connect to a suitable AP now!
[1513219585] ssid=
Attempting to reconnect...
Connection result: 5
[E][WiFiUdp.cpp:183] endPacket(): could not send data: 118
W (19462) wifi: Haven't to connect to a suitable AP now!
[1513219586] ssid=
Attempting to reconnect...
Connection result: 5

... ( loops like this indefinitely ) ...

ESP32, experiment (B) with disconnect(true) + begin(ssid, passwd)

rst:0x10 (RTCWDT_RTC_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0018,len:4
load:0x3fff001c,len:812
load:0x40078000,len:0
load:0x40078000,len:11404
entry 0x40078aa0
I (27) wifi: wifi firmware version: 708a055
I (28) wifi: config NVS flash: enabled
I (28) wifi: config nano formating: disabled
I (36) wifi: Init dynamic tx buffer num: 32
I (36) wifi: Init data frame dynamic rx buffer num: 64
I (36) wifi: Init management frame dynamic rx buffer num: 64
I (39) wifi: wifi driver task: 3ffd46d0, prio:23, stack:4096
I (45) wifi: Init static rx buffer num: 10
I (48) wifi: Init dynamic rx buffer num: 0
I (52) wifi: wifi power manager task: 0x3ffd940c prio: 21 stack: 2560
I (431) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
I (2842) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (2843) wifi: state: init -> auth (b0)
I (2847) wifi: state: auth -> assoc (0)
I (2863) wifi: state: assoc -> run (10)
I (2863) wifi: connected with glowyrectangle, channel 6
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Setup completed
[1513219802] ssid=glowyrectangle
[1513219803] ssid=glowyrectangle
[1513219804] ssid=glowyrectangle
I (5863) wifi: pm start, type:0

[1513219805] ssid=glowyrectangle
[1513219806] ssid=glowyrectangle
[1513219807] ssid=glowyrectangle
[1513219808] ssid=glowyrectangle
[1513219809] ssid=glowyrectangle
[1513219810] ssid=glowyrectangle
[1513219811] ssid=glowyrectangle
[1513219812] ssid=glowyrectangle
[1513219813] ssid=glowyrectangle
[1513219814] ssid=glowyrectangle
[1513219815] ssid=glowyrectangle
[1513219816] ssid=glowyrectangle
I (17282) wifi: bcn_timout,ap_probe_send_start
[1513219817] ssid=glowyrectangle
[1513219818] ssid=glowyrectangle
[1513219819] ssid=glowyrectangle
I (19785) wifi: ap_probe_send over, resett wifi status to disassoc
I (19785) wifi: state: run -> init (1)
I (19786) wifi: pm stop, total sleep time: 0/13922305

I (19786) wifi: n:6 0, o:6 0, ap:255 255, sta:6 0, prof:1
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 200 - NO_AP_FOUND
W (20616) wifi: Haven't to connect to a suitable AP now!
[1513219820] ssid=
Attempting to reconnect...
I (20632) wifi: mode : null
I (20634) wifi: flush txq
I (20634) wifi: stop sw txq
I (20635) wifi: lmac stop hw txq
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 3 - STA_STOP
I (20644) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 201 - AUTH_FAIL
Connection result: 1
[E][WiFiUdp.cpp:183] endPacket(): could not send data: 118
W (24182) wifi: Haven't to connect to a suitable AP now!
[1513219823] ssid=
Attempting to reconnect...
I (24254) wifi: mode : null
I (24256) wifi: flush txq
I (24257) wifi: stop sw txq
I (24257) wifi: lmac stop hw txq
I (24266) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 3 - STA_STOP
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 5 - STA_DISCONNECTED
[W][WiFiGeneric.cpp:270] _eventCallback(): Reason: 201 - AUTH_FAIL
Connection result: 1
[E][WiFiUdp.cpp:183] endPacket(): could not send data: 118
W (27804) wifi: Haven't to connect to a suitable AP now!
[1513219827] ssid=
Attempting to reconnect...
I (27820) wifi: mode : null
I (27822) wifi: flush txq
I (27822) wifi: stop sw txq
I (27822) wifi: lmac stop hw txq
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 3 - STA_STOP
I (27832) wifi: mode : sta (30:ae:a4:04:42:c8)
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 2 - STA_START
I (30311) wifi: n:6 0, o:1 0, ap:255 255, sta:6 0, prof:1
I (30312) wifi: state: init -> auth (b0)
I (30318) wifi: state: auth -> assoc (0)
I (30339) wifi: state: assoc -> run (10)
I (30340) wifi: connected with glowyrectangle, channel 6
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 4 - STA_CONNECTED
[D][WiFiGeneric.cpp:265] _eventCallback(): Event: 7 - STA_GOT_IP
Connection result: 3
[1513219831] ssid=glowyrectangle
[1513219832] ssid=glowyrectangle
I (33340) wifi: pm start, type:0

[1513219833] ssid=glowyrectangle
[1513219834] ssid=glowyrectangle
[1513219835] ssid=glowyrectangle
[1513219835] ssid=glowyrectangle
[1513219836] ssid=glowyrectangle

Wrapup

I've done some digging around in the SDK, but don't see anything obvious. I may dig around a bit, but I hope the above information is useful. If there's any other info I can provide please let me know; I'd be more than happy to oblige.

I tried getting some pcaps in monitor mode, but they weren't very interesting. Just seemed that the ESP32 wasn't sending anything after the disconnect.

I'm happy that I've at least found a workaround (disconnect(true) + begin(ssid, passwd)).

zhangyanjiaoesp commented 6 years ago

@vseven @Markusenz @sidoh Sorry to reply later, and thanks for your hard work on the SDK. The attachment is the debug version of wifi lib,you can use it to have a test, then we can analysis your logs to solve the problem. wifi_lib.tar.gz

vseven commented 6 years ago

@igrr - I have tried both open and secure with the same results. I can try to get you a packet capture but the issue is extremely random, it could do it 4 times in a hour or twice in a day so capturing might be very tedious. Now that multiple people are having the same issue however I will try the debug version that @zhangyanjiaoesp posted and if that still doesn't lead you to the solution I'll try the packet capture.

Will auto reconnect be fixed to work as intended? It sounds like @sidoh is getting around it which I have not had much luck with. Only rebooting seems to get it around these disconnect loops.

I'll get this recompiled and loaded this weekend.

EDIT: The wifi_lib.tar.gz file is empty. Can you please repost?

sidoh commented 6 years ago

I think the most interesting thing I noticed was that:

WiFi.disconnect(true);
WiFi.begin(SSID, PASSWORD);

always eventually reconnects for me (the true param to disconnect is important). I think WiFi.disconnect(true) turns off the radio. My guess is that under certain circumstances some state about the network the ESP32 is supposed to be connecting to gets cleared, and the above resets it.

Also interesting is that the issue appears to be 100% reproducible by turning off the wifi network, waiting for the ESP32 to notice it has disconnected, and turning the network back on. If the network reappears before the ESP32 notices, it's sometimes able to reconnect successfully. From the logs, the difference appears to be that a NO_AP_FOUND event is triggered in the former case, but not the latter.

zhangyanjiaoesp commented 6 years ago

@vseven wifi_debug_lib.zip

sidoh commented 6 years ago

Re-ran the code linked with the debug libs. Logs here:

This was all on an open network.

edit - link to raw logs instead of gist page.

vseven commented 6 years ago

@zhangyanjiaoesp - Any news on this? I haven't tried the debug code because @sidoh already did and provided lots of logs already but then again it's been a couple more weeks and no real answers.

zhangyanjiaoesp commented 6 years ago

@sidoh In the slow_ap_recreate.log ,we find there are print like

[1513308152]ssid=
Attempting to reconnect...

it seems like your ssid is empty, so after scanning all channels, return no ap found. so you should check your code to see whether the ssid is set correctly. screenshot from 2018-01-04 17-55-20

sidoh commented 6 years ago

Hi @zhangyanjiaoesp,

I think this is probably the meat of the issue. In the slow_ap_recreate.log, the reconnection code is just doing (full sketch here):

WiFi.begin();

so it's not supplying an AP. My understanding of the SDK is that it should just reconnect to the last known AP. Perhaps if it can't find the last used SSID, it clears the config? Is that expected?

Using this instead seems to always work:

WiFi.disconnect(true);
WiFi.begin(ssid, passwd);

edit - just to make sure the setup is clear, the case where reconnecting is not working is when the AP disappears for long enough for the SDK to decide that it's gone. Maybe this is expected behavior, though.

zhangyanjiaoesp commented 6 years ago

@sidoh Yes, you are right. This is expected behavior, you can workaround it by your method:

WiFi.disconnect(true); WiFi.begin(ssid, passwd);

sidoh commented 6 years ago

I see, this makes sense. I'll try to reproduce without the WiFi.begin() in the next couple of days. Think the right thing is to just have the SDK handle reconnection.

Where is reconnection handled in the SDK? Only reference I can find is here, which seems to use WiFi.begin(). Is that expected?

zhangyanjiaoesp commented 6 years ago

yes, you are right

sidoh commented 6 years ago

Doesn't that mean that the SDK code handling reconnections would have the same issue my demo has? If it tries to reconnect after the AP has disappeared, the SSID will get cleared and it won't be able to reconnect unless the AP info is re-supplied?