esp8266 / Arduino

ESP8266 core for Arduino
GNU Lesser General Public License v2.1

SDK issues (SDK reverted from pre3 to 2.2.1) #5784

Closed d-a-v closed 1 year ago

d-a-v commented 5 years ago

This issue keeps track of underlying espressif SDK questions

Current SDK in master branch is nonos-sdk-2.2.1, as shipped in core-2.4.2.

This espressif nonos-sdk 2.2.1 has a WiFi sleep bug (ref: #2330) which is partly solved by a pre-version of espressif nonos-sdk-v3. Workarounds exist for this issue, such as regularly sending gratuitous ARP.
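To illustrate the gratuitous ARP workaround, here is a minimal sketch of what such a packet contains and how it could be sent on a schedule. It is not the core's implementation: `buildGratuitousArp` and `arpDue` are hypothetical helper names, and the sketch only builds the 28-byte ARP payload (an ARP request whose sender and target IP are both the station's own address). Actually putting it on the air is left to the network stack.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Hypothetical helper: builds the 28-byte ARP payload of a gratuitous ARP
// request. Sender and target protocol address are both our own IP.
std::array<uint8_t, 28> buildGratuitousArp(const uint8_t mac[6], const uint8_t ip[4]) {
    assert(mac != nullptr && ip != nullptr);
    std::array<uint8_t, 28> p{};      // zero-initialized
    p[0] = 0x00; p[1] = 0x01;         // hardware type: Ethernet
    p[2] = 0x08; p[3] = 0x00;         // protocol type: IPv4
    p[4] = 6;                         // hardware address length
    p[5] = 4;                         // protocol address length
    p[6] = 0x00; p[7] = 0x01;         // operation: request
    std::memcpy(&p[8], mac, 6);       // sender hardware address
    std::memcpy(&p[14], ip, 4);       // sender protocol address
    // p[18..23]: target hardware address, left as zeros
    std::memcpy(&p[24], ip, 4);       // target protocol address = own IP
    return p;
}

// Hypothetical helper: true when intervalMs has elapsed since lastMs.
// Unsigned subtraction keeps this correct across 32-bit timer wraparound.
bool arpDue(uint32_t nowMs, uint32_t lastMs, uint32_t intervalMs) {
    return static_cast<uint32_t>(nowMs - lastMs) >= intervalMs;
}
```

On a device, `arpDue` would be polled from the main loop with the current millisecond counter, and the interval is something to tune per network.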

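The unofficial nonos-sdk-pre-v3.0.0 espressif firmware shipped with arduino-core 2.5.0 has two issues that led us to revert to 2.2.1:

  • Some boards show erratic behavior (radio connection is quickly lost), with an unknown cause. These boards work well with previous nonos-sdk-2.2.1 firmware (#5736)
  • Overall performance has decreased (#5513)

Why not migrate to the official nonos-sdk-v3 now?

  • arduino core uses unofficial hacks and API with nonos-sdk. These hacks currently do not work with latest versions (v3.0.0+) of this sdk.
  • nonos sdk is claimed to be EOL by Espressif, meaning that we can never officially ask for feature support like full WPA2-Enterprise (rtos, nonos) (refs: 1 2)
  • Migrating to RTOS-SDK is under serious consideration. Disclaimer: this does not mean that esp8266 arduino-core will become an RTOS; we can always run in a single task / single stack while being run by an RTOS (I personally think this is also the nonos-sdk model), because we are short on main RAM.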

FWdeveloper commented 5 years ago

I am looking for WPA2-Enterprise implementation for Arduino. As I understand from the initial comment:

If so, is it possible to provide the rough estimates about RTOS SDK availability on Arduino?

devyte commented 5 years ago

@FWdeveloper correct: the nonos sdk doesn't have full WPA2-E support. And not correct: the FreeRTOS sdk also has incomplete WPA2-E support. I understand a first version is expected in FreeRTOS SDK v3.2, but that was a comment from some time ago, and the current status could be different. As for migrating our core to the FreeRTOS sdk, it's too early even for a rough estimate. I would like to do a full migration, and the underlying architectures and build systems are far too different for a quick solution, so it will take a long time. I can say a full migration will likely take longer than a year.

FWdeveloper commented 5 years ago

@devyte Thank you for your comment. A year is a long story :(

What I think I can do in this case:

devyte commented 5 years ago

@FWdeveloper the .h is not enough, it's just the declarations. The definitions and other underlying code are in closed-source libs provided by Espressif, so nothing we can do there. And no, there is no open-source implementation of WPA2-E. The relevant code is tightly related to the low level wifi comms, and that's in closed-source libs that belong to Espressif. We have no access to that. The whole WPA2-E is a long discussion, with no current complete solution. There are other issues covering it, here as well as in the NONOS SDK and FreeRTOS repos, as well as in forums. I suggest not hijacking this thread, which is meant to cover the pre3 vs. 2.2.1 versions of the NONOS SDK, and how to address the immediate issues.

FWdeveloper commented 5 years ago

@devyte Yes, I meant C/C++ source files along with wpa2_enterprise.h header used to build libwpa2.a library. Thank you for clarification.

d-a-v commented 5 years ago

This is again turning into a WPA2-Enterprise discussion :) You can try this sketch if your AP is using MSCHAPv2 auth-phase-II (https://github.com/espressif/ESP8266_NONOS_SDK/issues/133) We will not provide any help, we don't own this api nor have sources for it. You have to use it as it is for now.

5chufti commented 5 years ago

It would be nice to have something like a "known problem list" accompanying all the "pros" for the relevant version in the releases list. That would make it much easier to choose the best-fitting version for a project's requirements.

devyte commented 5 years ago

That would be awesome, except that we iron out all the big known problems in the betas prior to a base release (at least we've been trying very hard to do that), and then problems like this one are found and reported after the base release, at which time the release notes are already done....

devyte commented 5 years ago

Right now we need serious help figuring out what the problem is, what solution or workaround can be implemented with the current sdk pre3, and/or how to migrate to sdk3.

Aircoookie commented 5 years ago

Hello, I too have encountered this issue in my application and recommended that my users stay on 2.4.2 for the time being. While I cannot help in resolving the cause of the problem, I tested the application using 2.5.0 with all 17 of my Wemos D1 boards and can provide some data. 5 out of the 17 ESPs are affected by the issue, and there seems to be a possible correlation with the MAC address range:

MAC Affected by issue?
5ccf7ffbc4f6 Yes
60019423aa54 No
60019406642b No
60019423b441 Yes
807d3a3beef7 No
840d8e85eaa2 No
840d8e85ef0e No
84f3eb734a3a No
a020a6038bda No
a020a616f9e7 No
a020a6171de5 No
b4e62d44bdb3 No
bcddc22461f3 Yes
cc50e30847d8 Yes
cc50e3454c88 Yes
ecfabc20bb2e No
ecfabc20ff67 No

The only device that falls outside the pattern is 60019423b441, which is affected even though the other 2 devices in the 600194 MAC range work as intended, so my results might be a coincidence. All devices were flashed with the exact same image and tested one after the other, each with two different power sources, so that can't be the source of the issue. Each device works flawlessly using core v2.4.2.

Let me know if it'd help if I did some more tests.
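The MAC-range correlation can be checked by grouping the boards by their first three bytes (the OUI prefix). The following is a minimal sketch using only the 17 entries reported in the table above; `OuiStats`, `reportedBoards` and `groupByOui` are names made up for this illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>
#include <vector>

struct OuiStats { int total = 0; int affected = 0; };

// The 17 boards from the report: MAC (12 hex digits) and "affected" flag.
std::vector<std::pair<std::string, bool>> reportedBoards() {
    return {
        {"5ccf7ffbc4f6", true},  {"60019423aa54", false}, {"60019406642b", false},
        {"60019423b441", true},  {"807d3a3beef7", false}, {"840d8e85eaa2", false},
        {"840d8e85ef0e", false}, {"84f3eb734a3a", false}, {"a020a6038bda", false},
        {"a020a616f9e7", false}, {"a020a6171de5", false}, {"b4e62d44bdb3", false},
        {"bcddc22461f3", true},  {"cc50e30847d8", true},  {"cc50e3454c88", true},
        {"ecfabc20bb2e", false}, {"ecfabc20ff67", false},
    };
}

// Counts total and affected boards per OUI (first 6 hex digits of the MAC).
std::map<std::string, OuiStats> groupByOui(
        const std::vector<std::pair<std::string, bool>>& boards) {
    std::map<std::string, OuiStats> stats;
    for (const auto& board : boards) {
        assert(board.first.size() == 12);
        OuiStats& s = stats[board.first.substr(0, 6)];
        ++s.total;
        if (board.second) ++s.affected;
    }
    return stats;
}
```

Grouped this way, 600194 is the only prefix with mixed results (1 of 3 affected); every other prefix is either fully affected or fully unaffected. With one to three boards per prefix, that is suggestive rather than conclusive.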

TD-er commented 4 years ago

Just an update on more recent tests of core 2.6.1 builds with SDK2.2.2 and SDK3. Some users report that their boards are unable to reconnect to WiFi without a reboot. This mainly happens at low RSSI values, and a scan also reports significantly fewer available APs in the area compared to right after a (cold?) boot.

SDK3 builds do seem to perform better here.

d-a-v commented 4 years ago

@TD-er What do you think of this post ? Would you be able to confirm these findings ?

Also, you were the first one reporting performance issues with SDK3, I assume these are still there ?

TD-er commented 4 years ago

Well, my weather station outside uses this:

ESP82xx Core 2.6.0-dev, NONOS SDK 2.2.2-dev(38a443e), LWIP: 2.1.2 PUYA support

That's probably core 2.6.0 with the SDK22x_190703 (built Sep 24 2019 00:54:21)

And that one is really, really stable across a wide range of weather conditions (really wet, but also internal temperatures of up to 70 °C and down to freezing).

System info Weather station
Local Time: 2019-11-22 12:30:31
Uptime: 59 days 12 hours 31 minutes
Load: 32.90% (LC=295)
CPU Eco Mode: true
Free Mem: 11600 (7264 - LoadControllerSettings)
Free Stack: 3568 (1136 - LoadTaskSettings)
Heap Max Free Block: 8848
Heap Fragmentation: 28%
Boot: Cold boot (0)
Reset Reason: Software/System restart
Last Task: Background Task
SW WD count: 0

Network
Wifi: 802.11G (RSSI -63 dB)
IP Config: DHCP
IP / Subnet: 192.168.1.145 / 255.255.255.0
Gateway: 192.168.1.1
Client IP: 192.168.1.140
DNS: 192.168.1.1 / (IP unset)
Allowed IP Range: 192.168.1.0 - 192.168.1.255
STA MAC: 84:F3:EB:82:0B:75
AP MAC: 86:F3:EB:82:0B:75
SSID: Lurch4 (9C:C7:A6:C7:12:F6)
Channel: 6
Connected: 24d07h39m
Last Disconnect Reason: (200) Beacon timeout
Number Reconnects: 2

As you can see, the RSSI is nowhere near the -35 dB, so it is a good test I guess.

About the performance issues. I have not run SDK3 on my nodes a lot, so I cannot tell if it is still an issue. One of the users running SDK3 reported WD reboots, which may be due to performance, but they could very well be caused by lots of other issues. On core 2.6.x SDK2xx the stability has seen enormous improvements compared to core 2.5.x and 2.4.x.

Seeing this report from @ascillato, I think I will also move to the SDK of July. I am now building with the SDK of November, and that may coincide with the reported WiFi reconnect issues.

adbensi commented 4 years ago

Hello, I am using SDK 3 and the ESP8266 works better with it. I use the ESP8266WebServer library, and I am trying to use ESP8266WebServer with an Ethernet W5500 module on the ESP8266. It is very hard, and I still have issues.

Do you know whether this project exists: ESP8266WebServer with FSBrowser SPIFFS support on the Ethernet W5500 module?

Guys, I would like to do more. How can I help this project too? Best regards

devyte commented 4 years ago

@adbensi said:

I have SDK 3 and ESP8266 work better with this

Can you please explain exactly what you did and what you tested?

Did you know if exist this Project ?

It does, but it's not merged yet. There is a PR where Ethernet support is integrated with lwip, so the underlying sockets used are WiFiClient/WiFiServer instead of the ones in the Ethernet lib, and with that you can use the normal webserver.

adbensi commented 4 years ago

Hi devyte,

Can you please explain exactly what you did and what you tested?

Of course.

I got better stability: previously the Wi-Fi stopped responding for a few seconds at a time, and this no longer happens with the same module and code.

The average time of ICMP packets has become lower and more stable.

If I compile with version 2, I see a greater spread between the average and maximum times.

It does, but it's not merged yet.

I have an issue with this call in EthernetWebserver.h:

size_t contentLength = _currentClient.write(file);

It ends up calling this function in EthernetClient.cpp:

size_t EthernetClient::write(uint8_t b) { return write(&b, 1); }

which in turn calls:

size_t EthernetClient::write(const uint8_t *buf, size_t size)

There, if (Ethernet.socketSend(sockindex, buf, size)) return size; returns only one byte of data, because the size is fixed to 1 in EthernetClient::write(uint8_t b).

All other messages work great. Only StreamFile does not work. Why was it done this way?
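If the call really does resolve to the single-byte overload, as described above, one common way around it is to copy the file through a buffer and call the (buf, size) overload directly. This is only a sketch under that assumption: `FakeClient` is a hypothetical stand-in with the same two write() signatures quoted above, `std::istream` stands in for the SPIFFS file, and `streamInChunks` is a made-up helper name, not part of any Ethernet library.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a client that only has the two write()
// overloads quoted in the comment above. It records what was "sent".
struct FakeClient {
    std::string sent;
    size_t write(const uint8_t* buf, size_t size) {
        sent.append(reinterpret_cast<const char*>(buf), size);
        return size;
    }
    size_t write(uint8_t b) { return write(&b, 1); }
};

// Copies the whole source to the client in fixed-size chunks and
// returns the number of bytes the client accepted.
template <typename Client>
size_t streamInChunks(Client& client, std::istream& source, size_t chunkSize = 256) {
    uint8_t buf[1024];
    assert(chunkSize > 0 && chunkSize <= sizeof(buf));
    size_t total = 0;
    while (source) {
        source.read(reinterpret_cast<char*>(buf),
                    static_cast<std::streamsize>(chunkSize));
        const size_t got = static_cast<size_t>(source.gcount());
        if (got == 0) break;                 // nothing left to send
        const size_t written = client.write(buf, got);
        total += written;
        if (written != got) break;           // short write: stop and report
    }
    return total;
}
```

The chunk size is a trade-off between stack usage and the number of socket calls. The value used here is arbitrary.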

devyte commented 4 years ago

About Ethernet, let's not go off topic. If you want to discuss further details about that, please look me up in our gitter channel. I would like to know:

  1. exactly how you built your sketch
  2. what sketch you used for testing, or whether you can provide a MCVE that shows what you say about better stability

Please also do a loop count test, i.e. measure loops per second.
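A loop count test can be as simple as counting passes through the main loop over a fixed time window. The sketch below is one possible way to do it. `LoopRateMeter` is a hypothetical helper, written to take the timestamp as a parameter so it can be exercised off-device.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: counts calls to tick() and reports loops per second
// once per measurement window.
class LoopRateMeter {
public:
    explicit LoopRateMeter(uint32_t windowMs = 1000) : windowMs_(windowMs) {
        assert(windowMs > 0);
    }

    // Call once per loop pass with the current time in milliseconds.
    // Returns true when a window has just completed.
    bool tick(uint32_t nowMs) {
        if (!started_) {                // first call only starts the window
            started_ = true;
            windowStart_ = nowMs;
            return false;
        }
        ++count_;
        const uint32_t elapsed = nowMs - windowStart_;  // wraparound-safe
        if (elapsed < windowMs_) return false;
        lastRate_ = static_cast<uint32_t>(
            (static_cast<uint64_t>(count_) * 1000u) / elapsed);
        count_ = 0;
        windowStart_ = nowMs;
        return true;
    }

    // Loops per second measured over the last completed window.
    uint32_t loopsPerSecond() const { return lastRate_; }

private:
    uint32_t windowMs_;
    uint32_t windowStart_ = 0;
    uint32_t count_ = 0;
    uint32_t lastRate_ = 0;
    bool started_ = false;
};
```

In a sketch, tick() would be called from loop() with millis(), and the rate printed whenever it returns true. Comparing that number between two builds of the same sketch gives a rough view of the performance difference.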

adbensi commented 4 years ago

I will do this. Now I know how to measure other parameters too.

Thank you for offering help. I need to understand how to solve the transfer of a SPIFFS file over Ethernet on the ESP8266.

Dump-kamklue commented 4 years ago

This issue keeps track of underlying espressif SDK questions

Current SDK in master branch is nonos-sdk-2.2.1, as shipped in core-2.4.2.

This espressif nonos-sdk 2.2.1 has a WiFi sleep bug (ref: #2330) which is partly solved by a pre-version of espressif nonos-sdk-v3. Workarounds exist for this issue, such as regularly sending gratuitous ARP.

The unofficial nonos-sdk-pre-v3.0.0 espressif firmware shipped with arduino-core 2.5.0 has two issues that led us to revert to 2.2.1:

  • Some boards show erratic behavior (radio connection is quickly lost), with an unknown cause. These boards work well with previous nonos-sdk-2.2.1 firmware (#5736)
  • Overall performance has decreased (#5513)

Why not migrate to the official nonos-sdk-v3 now?

  • arduino core uses unofficial hacks and API with nonos-sdk. These hacks currently do not work with latest versions (v3.0.0+) of this sdk.
  • nonos sdk is claimed to be EOL by Espressif, meaning that we can never officially ask for feature support like full WPA2-Enterprise (rtos, nonos) (refs: 1 2)
  • Migrating to RTOS-SDK is under serious consideration. Disclaimer: this does not mean that esp8266 arduino-core will become an RTOS; we can always run in a single task / single stack while being run by an RTOS (I personally think this is also the nonos-sdk model), because we are short on main RAM.

5chufti commented 3 years ago

Any news on the feasibility of using SDK release 3.0.4 with the latest commits on the espressif github? They seem to have put quite some work into an abandoned firmware ... 22 commits since 3.0.3, including another release (3.0.4).

metarutaiga commented 1 year ago

I tried 3.0 and searched for what the problems are.

  1. Before feature/add_partition_table (efa7981), we can use it without modifying code.
  2. After feature/add_partition_table, once we add system_partition_table_regist, 69babe9 runs normally.
  3. After bugfix/workaround_ota, app_main is broken.
  4. After 2717db3, app_main is fixed, but the WiFi driver is broken.

Finally, we can run it by using 3.0.5 (b29dcd3) with the app_main of 3.0.3.