Closed 0xFEEDC0DE64 closed 3 years ago
Iam experiencing the same issue. Sending char with text all every sec results in this exact same error and restart
I'm starting to remove all ESPAsyncWebServer code from my project to replace it with espressif provided APIs, I see no future with this webserver if it keeps crashing all over the place after long runtimes, I need rock solid 24/7 ontime, seems like most arduino related library suffer of poor coding quality and bad memory management.
In my investigations on this webserver library here I saw handwritten linked lists in a very strange coding style with callbacks and so on instead of using perfectly fine C++ provided containers. WTF. I am wondering how it kept working so good for so long in my development cycle so far
Good for you! Write your own then. I believe you should first get even a fraction of the popularity of this code before saying "WTF".
I'm just saying the problems could all have been avoided by not writing another container implementation, chances are very low to write a bug free version, c++ std library does a very good job at it with big comittee behind it, porting arduino containers over to std containers would fix crashes for everyone, im under time pressure to finish the project and therefore cannot do it right now
My long journey of this crash finally found an end, I got rid of all the crashes :+1: First of all, I switched from platformio to esp-idf, which freed around 40kb of heap memory (which already took days to port all code over, but I'm still using arduino and ESPAsyncWebserver). Then I started replacing all the linked lists with C++ provided ones, committed my hacks to a branch, and used it as a esp-idf component. I don't know if I broke any of the features I don't use of the webserver, but the websocket and SPIFFS provided html page are rock solid now and never crashed again in the last few days. If anyone is interested in the branch of the webserver, i pushed it to github and might do further cleanups, but I guess a pull-request will never accepted, so I won't really bother creating one in the first place. (please forgive me the branch name, but i was in a little rage when I saw the original implementations):
https://github.com/me-no-dev/ESPAsyncWebServer/compare/master...0xFEEDC0DE64:dumbfixes?expand=1
Hey Daniel, First I've been very skeptic on your mood regarding c++ stl and usage of custom linked lists and array in this server. It is widely used opinion (proven by practice indeed) that stl is not optimized for embedded platform due to its dynamic allocation nature. Also usually usage of stl containers significantly increases size of binary (even on desktops) - so usage of linked lists makes a lot of sense in context of esp32 for example. Also I feel your pain for 200% as already spent year ago say 1 month of bumping into crashes with web socket impl you depicted and have to rollback to blocking sync version of WebSockets for Arduino...
And you know - I tried your version and wow - it actually decreases size of my binary on 9% (170k). Well it took me 30 minutes to switch from default gnu++11 to gnu++17 and still _queue = {} way to clear the queue does not work for me and was replaced with explicit constructors of empty queue, but... it works! I continue to test web server I have and so far it seems no features lost... I would need some time to switch back to async websocket again ) Also want to say that I verified if actual switch to c++17 helped with binary size - and no - it does not help...
Thank you for efforts and decision to write here with your results. Also imo your memory management and threading improvements should be merged from your pull request for everybody for sure...
Interesting thread. I may try out @0xFEEDC0DE64's port too.
In defence to the custom linked-list argument, don't forget how much available DRAM is available on an ESP8266 so any short-cut helps in that area. I recently replaced a few of of the Arduino STL containers I was using in one of my projects with homegrown ones. I wrote a small sketch that loaded up 200 elements into a std::list
/std::queue
/std::vector
and measured the heap memory it used on an ESP8266 as well as the fragmentation. The overhead is huge and by replacing with a light-weight alternative classes I was able to reduce the heap overhead by 15-20% in most cases. I also love using std::function
and std::bind
and lambda's but choose to replace those with ugly old-fashioned C-style function pointers which free'd up another 5%.
Also I used this port of AsyncWebServer https://github.com/sascha432/ESPAsyncWebServer which saves an additional 2K by moving all strings to Flash/PROGMEM.
For ESP32 only check out https://github.com/fhessel/esp32_https_server
I am very interested in any fixes to the stability of this webserver. So, in order to do some code review/cleanup, and also to make the code compatible with the default C++11 setup in Arduino/ESP32, I went through every patch in the dumbfixes branch of @0xFEEDC0DE64 . The result is here: yuboxfixes-0xFEEDC0DE64-cleanup. Besides making the patches C++11 compatible, I wanted to lay out every single fix and understand why each one of them was necessary. I am performing a torture test right now with the yuboxfixes-0xFEEDC0DE64-cleanup branch, and it has not crashed once yet in 3 hours so far (previous attempts crashed with various memory corruption backtraces).
@avillacis thanks for doing that! Have you noticed any differences other than the web-socket stability, such as changes in the memory heap used or firmware/sketch size?
@avillacis thanks for doing that! Have you noticed any differences other than the web-socket stability, such as changes in the memory heap used or firmware/sketch size?
I have not checked RAM usage, but the firmware file is 1025344 bytes before the changes by @0xFEEDC0DE64 , and 1026064 after.
My long journey of this crash finally found an end, I got rid of all the crashes 👍 First of all, I switched from platformio to esp-idf, which freed around 40kb of heap memory (which already took days to port all code over, but I'm still using arduino and ESPAsyncWebserver). Then I started replacing all the linked lists with C++ provided ones, committed my hacks to a branch, and used it as a esp-idf component. I don't know if I broke any of the features I don't use of the webserver, but the websocket and SPIFFS provided html page are rock solid now and never crashed again in the last few days. If anyone is interested in the branch of the webserver, i pushed it to github and might do further cleanups, but I guess a pull-request will never accepted, so I won't really bother creating one in the first place. (please forgive me the branch name, but i was in a little rage when I saw the original implementations):
https://github.com/me-no-dev/ESPAsyncWebServer/compare/master...0xFEEDC0DE64:dumbfixes?expand=1
I'm not sure if I missed something obvious, but I can't compile using the commit referenced here. Please could you outline any changes that are needed to use this branch. Here are some of the issues I ran into - there were a lot of compile errors, so I assume something else needs doing and not just adding the git url to my platformio.ini file. https://github.com/h2zero/NimBLE-Arduino/issues/167#issuecomment-761102439
Kudos to @0xFEEDC0DE64 for the patch and @avillacis for the code cleanup! I was plagued by random crashes right from the start of using this library to something very simple, obtain a JSON from a POST request and deserialize it through ArduinoJSON. It doesn't use websockets. With the latest AsyncWebServer build in PlatformIO I had a crash between 10-120 mins. After replacing it with the patched version, no crashes whatsoever for 24 hours.
Here's the test code:
//
// A simple server implementation showing how to:
// * serve static messages
// * read GET and POST parameters
// * handle missing pages / 404s
//
#include <Arduino.h>
#include <WiFi.h>
#include "src/ESPAsyncWebServer/AsyncTCP.h"
#include "src/ESPAsyncWebServer/ESPAsyncWebServer.h"
#include "src/ESPAsyncWebServer/AsyncJson.h"
#include <ArduinoJson.h>
AsyncWebServer server(80);
// Set your Static IP address
IPAddress local_IP(192, 168, 1, 2);
// Set your Gateway IP address
IPAddress gateway(192, 168, 1, 1);
IPAddress subnet(255, 255, 255, 0);
IPAddress primaryDNS(8, 8, 8, 8); //optional
IPAddress secondaryDNS(8, 8, 4, 4); //optional
const char* ssid = "xxx";
const char* password = "yyy";
const char* PARAM_MESSAGE = "message";
const int stationID = 98;
long tableUptime[3];
uint16_t tableError[3];
byte tableUsbCharging[3];
byte tableUsbCounter[3];
long tableUsbTime[3];
byte tableQiCharging[3];
byte tableQiCounter[3];
long tableQiTime[3];
float tableInputVoltage[3];
float tableUsb1Voltage[3];
float tableUsb2Voltage[3];
float tableQiVoltage[3];
float tableUsbPower[3];
float tableQiPower[3];
bool tableUsbStatus[3];
bool tableQiStatus[3];
uint16_t tableMinWiFiDelay;
uint16_t tableMaxWiFiDelay;
uint16_t tableSleepDelay;
long previousMillis;
void notFound(AsyncWebServerRequest *request) {
request->send(404, "text/plain", "Not found");
}
void setup() {
Serial.begin(115200);
Serial.print("Setting AP (Access Point)…");
// Remove the password parameter, if you want the AP (Access Point) to be open
// Configures static IP address
if (!WiFi.config(local_IP, gateway, subnet, primaryDNS, secondaryDNS)) {
Serial.println("STA Failed to configure");
}
Serial.print("Connecting to AP...");
int wait_time = 0;
WiFi.mode(WIFI_STA);
WiFi.begin(ssid, password);
delay(150);
char devicename[20]="Station-";
char tempValue[5];
itoa(stationID,tempValue,10);
strcat(devicename, tempValue);
WiFi.setHostname(devicename);
delay(500);
while (WiFi.status() != WL_CONNECTED) {
delay(500);
Serial.print(".");
if(wait_time > 10) break;
wait_time++;
feedLoopWDT();
}
if(WiFi.status() == WL_CONNECTED) { //WiFi AP is available, go ahead
Serial.println("succeeded");
} else { //WiFi not available, increment failed counter
Serial.println("failed");
WiFi.mode(WIFI_OFF); //switch it off since it didn't connect. Will try again later
}
webServer();
}
void loop() {
long curMillis = millis();
if(curMillis - previousMillis > 10000) {
previousMillis = curMillis;
Serial.print("Uptime: ");
Serial.println(curMillis);
}
}
void webServer() {
// Add a handler to process the JSON contained in POST request table-update
AsyncCallbackJsonWebHandler *handler = new AsyncCallbackJsonWebHandler("/table-new", [](AsyncWebServerRequest *request, JsonVariant &json) {
StaticJsonDocument<1024> receivedDoc;
//DynamicJsonDocument receivedDoc(1024);
if (json.is<JsonArray>()) {
receivedDoc = json.as<JsonArray>();
} else if (json.is<JsonObject>()) {
receivedDoc = json.as<JsonObject>();
}
int tableID = receivedDoc["ID"];
if(tableID > 0 && tableID < 4) { //tables are numbered 1,2,3
tableID=tableID-1; //subtract 1 as we store data in arrays
//we aren't doing data validation as ArduinoJson returns zero if the variable is not big enough to hold the information
tableUptime[tableID] = receivedDoc["uptime"];
tableError[tableID] = receivedDoc["error"]; //when it is zero there are no errors
tableUsbCharging[tableID] = receivedDoc["usbCharging"];
tableUsbCounter[tableID] = receivedDoc["usbCounter"];
tableUsbTime[tableID] = receivedDoc["usbTime"];
tableQiCharging[tableID] = receivedDoc["qiCharging"];
tableQiCounter[tableID] = receivedDoc["qiCounter"];
tableQiTime[tableID] = receivedDoc["qiTime"];
tableInputVoltage[tableID] = receivedDoc["inputVoltage"];
tableUsb1Voltage[tableID] = receivedDoc["usb1Voltage"];
tableUsb2Voltage[tableID] = receivedDoc["usb2Voltage"];
tableQiVoltage[tableID] = receivedDoc["qiVoltage"];
tableUsbPower[tableID] = receivedDoc["usbPower"];
tableQiPower[tableID] = receivedDoc["qiPower"];
request->send(200); //reply with a HTTP 200 OK message
} else {
Serial.println("[HTTP] Error: missing table ID in table-update request");
request->send(400); //reply with a HTTP 400 Bad Request message
}
});
server.addHandler(handler);
// Reply to table-command GET request with a JSON containing the table commands
server.on("/table-send", HTTP_GET, [] (AsyncWebServerRequest *request) {
if (request->hasParam("id")) {
int tableID = request->getParam("id")->value().toInt(); //we need to convert a String value into an integer
Serial.print("[HTTP] sending commands to table ID ");
Serial.println(tableID);
tableID=tableID-1; //subtract 1 as we store data in arrays
//DynamicJsonDocument sentDoc(512);
StaticJsonDocument<512> sentDoc;
if(tableUsbStatus[tableID]) {
sentDoc["usb"] = "on";
} else {
sentDoc["usb"] = "off";
}
if(tableQiStatus[tableID]) {
sentDoc["qi"] = "on";
} else {
sentDoc["qi"] = "off";
}
sentDoc["minWiFiDelay"] = tableMinWiFiDelay;
sentDoc["maxWiFiDelay"] = tableMaxWiFiDelay;
sentDoc["sleepDelay"] = tableSleepDelay;
char response[512];
serializeJson(sentDoc, response);
request->send(200, "application/json", response); //reply with a HTTP 200 OK message
Serial.println(response);
} else {
Serial.println("[HTTP] Error: missing table ID in request");
request->send(400); //reply with a HTTP 400 Bad Request message
}
});
server.onNotFound(notFound);
server.begin();
}
i have the same issue here. The crash is random and unpredictable.
Any progress on this issue? I am finding it easy to reproduce using textAll to send small websocket messages (approx 35-40 bytes) to a client, at approximately 5hz (200ms delay between messages), after about five minutes the ESP32 will crash with a corrupt heap error.
I completely ditched the ESPAsyncWebserver from my project and started using the espressif provided httpd, which also forced to ditch platformio for the esp-idf, and since them I have never encountered any bugs or any problems anymore. The refactoring was huge to switch to the C apis from espressif but was worth all the efforts :)
@0xFEEDC0DE64 and @avillacis, thanks for the reworked library. It's a pity the official repo is outdated and not taking pull request anymore.
@avillacis, would you mind enabling issues in your repo, so as to keep the library evolving somewhere? If you are willing to: Go to the Settings page of your fork. Check the box next to Issues
I have tried to test the library but it seems that the class AsyncWebSocketMessageBuffer has been removed. Any ideas why? For sure for a good reason, but, could you please explain how to circumvent the use of it so I can test the library? Thanks!
[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
@0xFEEDC0DE64 and @avillacis, thanks for the reworked library. It's a pity the official repo is outdated and not taking pull request anymore.
@avillacis, would you mind enabling issues in your repo, so as to keep the library evolving somewhere? If you are willing to: Go to the Settings page of your fork. Check the box next to Issues
Sorry for the late reply. I have been busy with work. Issues on the fork are now enabled.
I have tried to test the library but it seems that the class AsyncWebSocketMessageBuffer has been removed. Any ideas why? For sure for a good reason, but, could you please explain how to circumvent the use of it so I can test the library? Thanks!
As commented in branch yuboxfixes-0xFEEDC0DE64-cleanup
, commit https://github.com/yubox-node-org/ESPAsyncWebServer/commit/4963ce9da983e433d938e8b978327f0eac2ddd8c removes AsyncWebSocketMessageBuffer in order to use instead a std::shared_buffer implementation. This achieves a significant code simplification as well as removing a homebrewed implementation of a smart pointer to a buffer. I am not sure what you mean by "circumvent". What scenario or code path do you intend to exercise?
[STALE_CLR] This issue has been removed from the stale queue. Please ensure activity to keep it openin the future.
@0xFEEDC0DE64 and @avillacis (and others) many thanks for this improved fork! I've found it to be a tremendous improvement in stability with websockets. I hope other people will find this post earlier than I did.
[STALE_SET] This issue has been automatically marked as stale because it has not had recent activity. It will be closed in 14 days if no further activity occurs. Thank you for your contributions.
[STALE_DEL] This stale issue has been automatically closed. Thank you for your contributions.
After a few hours of uptime and lots of calls to textAll, it sometimes crashes with the following stack:
Guru Meditation Error: Core 1 panic'ed (LoadProhibited). Exception was unhandled.