esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

Debug output says wifi connected, but WiFi.status() keeps returning 7 #9060

Open php4fan opened 11 months ago

php4fan commented 11 months ago

This seems to happen randomly (but often) on some networks. I'm seeing this on a particular wifi network almost systematically (but not quite) while the same code on the same board had worked for ages on other networks.

This is my code (simplified):

#include <ESP8266WiFi.h>

void setup() {
  Serial.begin(115200);
  String client_ssid = "mySSID";
  String client_password = "mypassword";

  WiFi.mode(WIFI_STA);
  WiFi.begin(client_ssid.c_str(), client_password.c_str());

  // Poll the connection status once per second, for up to 20 seconds.
  for (int i = 0; i < 20; i++) {
    Serial.printf("Trying to connect to Wifi '%s' with password '%s'\n", client_ssid.c_str(), client_password.c_str());
    delay(1000);
    if (WiFi.status() == WL_CONNECTED) {
      Serial.println("...connected successfully\n");
      // Here I would normally do stuff...

      return;
    }
  }
  Serial.println("Cannot connect to WiFi!\n");
  return;
}
fpm close 3 
mode : sta(80:7d:3a:3c:e2:1d)
add if0
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
scandone
state: 0 -> 2 (b0)
Trying to connect to Wifi 'CASAS' with password '******'
state: 2 -> 3 (0)
state: 3 -> 5 (10)
add 0
aid 17
cnt 

connected with CASAS, channel 6
dhcp client start...
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
pm open,type:2 0
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Trying to connect to Wifi 'CASAS' with password '******'
Cannot connect to WiFi!

Note the line that says "connected with CASAS, channel 6": that's not from my println() calls, that's some internal output.

It was able to connect once or twice on the very same network, which is working fine and the password is correct, obviously.

I'm reporting it here because there's internal debug output that says "connected" and yet status() is returning WL_DISCONNECTED, so that's an inconsistency that is obviously in software. Either the debug output that says "connected" is bogus, or status() is returning the wrong value (or it connects and immediately loses connection, and there's no debug output for that which would be a bug too - but that seems pretty unlikely).

php4fan commented 11 months ago

I've added a println() for the returned status and it's 7 all the time.

php4fan commented 11 months ago

Ok I think the apparent inconsistency actually has an explanation.

It seems that DHCP is taking forever. The debug output "connected with XXXX, channel Y" is from the SDK and is printed after bare wifi connection is established but before DHCP starts. Then DHCP takes forever, and my code times out before it reaches WL_CONNECTED status which is only reached when DHCP is done, as expected.

So that makes sense. The issue is why DHCP takes so long in the first place. I increased the timeout and still can't connect most of the time (sometimes it does). It's not an issue with the router: no other devices have problems connecting, and the router is very close.

mcspr commented 11 months ago

> It's not an issue with the router, no other devices have problems connecting and the router is very close.

Do you mean other esp8266 boards or simply other wifi-enabled devices? What is the model of your router?

> The debug output "connected with XXXX, channel Y" is from the SDK and is printed after bare wifi connection is established but before DHCP starts. Then DHCP takes forever, and my code times out before it reaches WL_CONNECTED status which is only reached when DHCP is done, as expected. The issue is why does DHCP take so long in the first place. I increased the timeout and still can't connect most of the times (sometimes it does)

WiFi connection status here means exactly that: either the netif has an IP, or it does not. I suppose you could also try to set up a static IP and check whether the network layer is functional.
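
A static-IP test along those lines could look roughly like the sketch below. All addresses and credentials are hypothetical placeholders and must be replaced with values that fit the actual network; `Ip4`, `sameSubnet` and `toIp` are illustrative helper names, and the 15-second timeout is an arbitrary choice.

```cpp
#ifdef ARDUINO
#include <ESP8266WiFi.h>
#endif
#include <array>
#include <cstdint>

using Ip4 = std::array<uint8_t, 4>;

// Hypothetical placeholder addresses; adjust to the real network.
const Ip4 kLocalIp = {192, 168, 1, 50};
const Ip4 kGateway = {192, 168, 1, 1};
const Ip4 kSubnet  = {255, 255, 255, 0};

// Sanity check: a static address is only usable if it lies in the gateway's subnet.
bool sameSubnet(const Ip4& a, const Ip4& b, const Ip4& mask) {
  for (int i = 0; i < 4; i++) {
    if ((a[i] & mask[i]) != (b[i] & mask[i])) return false;
  }
  return true;
}

#ifdef ARDUINO
IPAddress toIp(const Ip4& a) { return IPAddress(a[0], a[1], a[2], a[3]); }

void setup() {
  Serial.begin(115200);
  WiFi.mode(WIFI_STA);
  if (sameSubnet(kLocalIp, kGateway, kSubnet)) {
    // With a static configuration the DHCP client is not used.
    WiFi.config(toIp(kLocalIp), toIp(kGateway), toIp(kSubnet));
  }
  WiFi.begin("mySSID", "mypassword");  // placeholder credentials
  if (WiFi.waitForConnectResult(15000) == WL_CONNECTED) {
    Serial.print("Connected with static IP ");
    Serial.println(WiFi.localIP());
  } else {
    Serial.println("Still not connected, even without DHCP");
  }
}

void loop() {}
#endif
```

If the board connects reliably this way, that would point at the DHCP exchange rather than at the radio link; if it still fails, the problem is probably lower in the stack.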

Several issues mention G mode as a solution (e.g. https://github.com/esp8266/Arduino/issues/8950#issuecomment-1640277771), but so far I don't really see a pattern. Without being able to reproduce it, it is quite difficult to understand the issue here: is something wrong with the DHCP client, is something done incorrectly in the lwip1 <-> lwip2 layer, is something bad happening in the SDK's lwip1 itself, or is the SDK WiFi driver to blame? We simply don't know.

php4fan commented 11 months ago

> Do you mean other esp8266 boards or simply other wifi-enabled devices?

Simply other wifi-enabled devices, such as smartphones, a tablet, and laptops.

BTW I reproduced the issue with two distinct (but identical) ESP8266 boards.

> What is the model of your router?

Technicolor aghp.

Unfortunately this is at my mother's home where I am only for a few days, far from where I live, so unless I can get a router of the same model for myself (which I'm willing to purchase if I can find it), I won't be able to do any further tests.

php4fan commented 11 months ago

I have changed the router's configuration from "b/g/n" to "b/g" (no changes on the client) and it's gone from almost never connecting to so far always successfully connecting several times in a row.

It could be just a coincidence, as I had already observed the issue coming and going randomly.

If it's not a coincidence, this, combined with the fact that WiFi.setPhyMode(WIFI_PHY_MODE_11G) has been reported to help, raises so many questions:

1

It has been said that the ESP8266 only supports modes B and G, not N. If so, why on earth is there also a WIFI_PHY_MODE_11N??

2

Certainly the ESP8266 knows that it only supports modes B and G, right? Certainly, whatever modes it supports, it tells the AP that during negotiation, right? So, there's no way the AP decides to use mode N (or whatever mode that the ESP doesn't support) and that prevents the connection from going well. Right? So why should forcing mode G on the client help? Why should excluding mode N on the AP help?

3

Maybe, because of a bug, if you don't call setPhyMode() to force a particular mode, the ESP connects to the AP without saying what modes it supports, so it may in fact happen that the AP chooses a mode that the ESP doesn't support (e.g. N), and therefore the connection doesn't work. Is this the hypothesis behind the idea of forcing mode G? If so, is this hypothesis compatible with the fact that we get in the output:

connected with Whatever, channel X
dhcp client start....

That seems weird: I would expect it to be unable to reach "connected with...".

4

If WiFi.setPhyMode(WIFI_PHY_MODE_11G) forces mode G, and WiFi.setPhyMode(WIFI_PHY_MODE_11B) forces mode B (and WiFi.setPhyMode(WIFI_PHY_MODE_11N) forces mode N which I don't understand since it supposedly doesn't support it), then how would one tell it to do its default thing which is to not force any mode in particular (meaning, I guess, negotiate one with the AP)? That's really more of a curiosity than a question relevant to finding a solution.

5

Should setPhyMode() be called before or after WiFi.begin()? This should be an easy one. Let's at least not multiply by 2 the number of combinations that we need to try.

asturcon3 commented 11 months ago

1, 2 & 3: The ESP8266 definitely supports mode N; I have a lot of them running happily. The suspicion that some APs cause issues by trying to steer devices from 2.4 GHz N to 5 GHz doesn't change that.

AFAIK the dialog on connection is just the opposite: the AP advertises its supported modes, and the client (the ESP) chooses one. A recent addition lets you get that information (supported modes) when scanning for WiFi networks, so you can then pick one to connect to.

4. AFAIK there is no WiFi.setPhyMode(WIFI_PHY_MODE_AUTO); I was looking for that. Lacking it, I simply cycle the PHY mode: each time I get a connection error or strange behaviour, I try the next mode. So one remote device settles on B, others on N; who cares.

5. Before.
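
The cycling approach described in this comment might be sketched as follows. This is an illustration under assumptions, not the commenter's actual code: `nextPhyMode` is a hypothetical helper, the credentials are placeholders, the starting mode and the 15-second timeout are arbitrary, and setPhyMode() is called before WiFi.begin() as advised in the comment.

```cpp
#ifdef ARDUINO
#include <ESP8266WiFi.h>  // provides WiFiPhyMode_t and the WiFi object
#else
// Host-side stand-in so the helper can be compiled off-device.
// Assumption: values mirror the core's WiFiPhyMode_t enum.
enum WiFiPhyMode_t {
  WIFI_PHY_MODE_11B = 1, WIFI_PHY_MODE_11G = 2, WIFI_PHY_MODE_11N = 3
};
#endif

// Hypothetical helper: step through the modes in the order B -> G -> N -> B.
WiFiPhyMode_t nextPhyMode(WiFiPhyMode_t mode) {
  switch (mode) {
    case WIFI_PHY_MODE_11B: return WIFI_PHY_MODE_11G;
    case WIFI_PHY_MODE_11G: return WIFI_PHY_MODE_11N;
    default:                return WIFI_PHY_MODE_11B;
  }
}

#ifdef ARDUINO
void setup() {
  Serial.begin(115200);
  WiFi.mode(WIFI_STA);

  WiFiPhyMode_t mode = WIFI_PHY_MODE_11N;  // arbitrary starting point
  for (int attempt = 0; attempt < 3; attempt++) {
    WiFi.setPhyMode(mode);               // set the mode before begin()
    WiFi.begin("mySSID", "mypassword");  // placeholder credentials
    if (WiFi.waitForConnectResult(15000) == WL_CONNECTED) {
      Serial.printf("Connected using PHY mode %d\n", (int)mode);
      return;
    }
    WiFi.disconnect();
    mode = nextPhyMode(mode);  // try the next mode on failure
  }
  Serial.println("No PHY mode produced a connection");
}

void loop() {}
#endif
```

Three attempts cover each mode once; a long-running device could instead remember the last working mode and only advance after a failure.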

d-a-v commented 11 months ago

> It seems that DHCP is taking forever.

One can also check and understand network packets exchange including DHCP from the ESP pov with the provided netdump library.

jorgechurriana commented 8 months ago

Did you find a solution? I have a similar issue. Everything works fine, but sometimes it never connects until I reset the router.

php4fan commented 8 months ago

> Did you find a solution.

Unfortunately, no.

MartinssG commented 6 months ago

Hi, you can try to adjust the connection part of the code -

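    // Fragment: ssid, password, startAPMode() and connectedToWifi come from the rest of the sketch.
    WiFi.begin(ssid.c_str(), password.c_str());
    // Blocks until the connection attempt succeeds, fails, or times out.
    if (WiFi.waitForConnectResult() != WL_CONNECTED) {
      Serial.print("WiFi status: ");
      Serial.println(WiFi.status());
      Serial.println("Failed to connect to WiFi. Entering AP mode...");
      startAPMode();
    } else {
      Serial.println("Successfully connected to WiFi");
      connectedToWifi = true;
    }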

It waits for the ESP8266 to connect to the WiFi without the for loop; this fixed a similar problem for me.