ayushsharma82 / ElegantOTA

OTA updates made slick and simple for everyone!
https://elegantota.pro
GNU Affero General Public License v3.0
634 stars 116 forks source link

kernel panic / crash when using two ElegantOTA instances and Soft AP #164

Closed chconnor closed 9 months ago

chconnor commented 9 months ago

I posted this question about using more than one ElegantOTAClass instance. I looked through the code and it seemed like it would be fine, so I went ahead.

Unfortunately, there is a crash and reboot when I try to access the /update URL for the Soft AP webserver.

Below is code demonstrating the issue. This is on an ESP32 DevKitC-VIE, Arduino framework, ElegantOTA 3.1.0

What works: STA webserver URLs / and /update AP webserver URL /

What crashes: AP webserver /update

If I only make the Soft AP webserver (no STA webserver) it works fine for / and for /update. It's only when both webservers are active, and when you go to the /update URL on the Soft AP webserver.

All other web pages that I serve from either web server, with or without authentication, work fine.

Demo code:

#include <Arduino.h>
#include <WiFi.h>
#include <WebServer.h>
#include <ElegantOTA.h>

WebServer *APwebserver, *STAwebserver;

void handleurlSTA()
{
  if (!STAwebserver->authenticate("username", "password"))
  {
    STAwebserver->requestAuthentication();
    return;
  }

  STAwebserver->send(200, "text/html", "STA webserver");
}

void handleurlAP()
{
  if (!APwebserver->authenticate("username", "password"))
  {
    APwebserver->requestAuthentication();
    return;
  }

  APwebserver->send(200, "text/html", "AP webserver");
}

void setup()
{
  Serial.begin(115200);
  while (!Serial) delay(10);

  WiFi.mode(WIFI_AP_STA);
  WiFi.softAP("softap", "password");
  WiFi.begin("wifiusername", "wifipassword");

  while(WiFi.status() != WL_CONNECTED)
  {
    Serial.print(".");
    delay(200);
  }

  Serial.println(" connected");
  Serial.printf("STA IP: %s\n", WiFi.localIP().toString().c_str());
  Serial.printf("Soft AP IP: %s\n", WiFi.softAPIP().toString().c_str());

  APwebserver = new WebServer(WiFi.softAPIP(), 80);
  STAwebserver = new WebServer(WiFi.localIP(), 80);
  APwebserver->on("/", handleurlAP);
  STAwebserver->on("/", handleurlSTA);

  ElegantOTAClass ElegantOTA2;
  ElegantOTA.begin(STAwebserver, "username", "password");
  ElegantOTA2.begin(APwebserver, "username", "password");

  STAwebserver->begin();
  APwebserver->begin();
}

void loop()
{
  STAwebserver->handleClient();
  APwebserver->handleClient();
}
Guru Meditation Error: Core  1 panic'ed (LoadProhibited). Exception was unhandled.

Core  1 register dump:
PC      : 0x400db02b  PS      : 0x00060930  A0      : 0x800db179  A1      : 0x3ffca890  
A2      : 0x0000010d  A3      : 0x3ffca92c  A4      : 0x0000000e  A5      : 0x0000ff00  
A6      : 0x00ff0000  A7      : 0xff000000  A8      : 0x800db008  A9      : 0x3ffca870  
A10     : 0x00000000  A11     : 0x0000000e  A12     : 0x0000000f  A13     : 0x3ffc3b94  
A14     : 0x3ffc3b94  A15     : 0x3ffd68b8  SAR     : 0x00000017  EXCCAUSE: 0x0000001c  
EXCVADDR: 0x0000011c  LBEG    : 0x4008a688  LEND    : 0x4008a693  LCOUNT  : 0xffffffff  

Backtrace: 0x400db028:0x3ffca890 0x400db176:0x3ffca8b0 0x400d6eb9:0x3ffca8d0 0x400d9b11:0x3ffca960 0x400d749b:0x3ffca9b0 0x400d7541:0x3ffca9d0 0x400d7627:0x3ffcaa10 0x400d77d2:0x3ffcaa80 0x400d3001:0x3ffcaad0 0x400dc8c5:0x3ffcaaf0

  #0  0x400db028:0x3ffca890 in String::move(String&) at /home/myusername/.platformio/packages/framework-arduinoespressif32/cores/esp32/WString.cpp:234
  #1  0x400db176:0x3ffca8b0 in String::operator=(String&&) at /home/myusername/.platformio/packages/framework-arduinoespressif32/cores/esp32/WString.cpp:277
  #2  0x400d6eb9:0x3ffca8d0 in WebServer::requestAuthentication(HTTPAuthMethod, char const*, String const&) at /home/myusername/.platformio/packages/framework-arduinoespressif32/libraries/WebServer/src/WebServer.cpp:240
  #3  0x400d9b11:0x3ffca960 in std::_Function_handler<void (), ElegantOTAClass::begin(WebServer*, char const*, char const*)::{lambda()#1}>::_M_invoke(std::_Any_data const&) at .pio/libdeps/esp32dev/ElegantOTA/src/ElegantOTA.cpp:29
      (inlined by) _M_invoke at /home/myusername/.platformio/packages/toolchain-xtensa-esp32/xtensa-esp32-elf/include/c++/8.4.0/bits/std_function.h:297
  #4  0x400d749b:0x3ffca9b0 in std::function<void ()>::operator()() const at /home/myusername/.platformio/packages/toolchain-xtensa-esp32/xtensa-esp32-elf/include/c++/8.4.0/bits/std_function.h:687
  #5  0x400d7541:0x3ffca9d0 in FunctionRequestHandler::handle(WebServer&, http_method, String) at /home/myusername/.platformio/packages/framework-arduinoespressif32/libraries/WebServer/src/detail/RequestHandlersImpl.h:45
  #6  0x400d7627:0x3ffcaa10 in WebServer::_handleRequest() at /home/myusername/.platformio/packages/framework-arduinoespressif32/libraries/WebServer/src/WebServer.cpp:651
  #7  0x400d77d2:0x3ffcaa80 in WebServer::handleClient() at /home/myusername/.platformio/packages/framework-arduinoespressif32/libraries/WebServer/src/WebServer.cpp:318
  #8  0x400d3001:0x3ffcaad0 in loop() at src/main.cpp:163
  #9  0x400dc8c5:0x3ffcaaf0 in loopTask(void*) at /home/myusername/.platformio/packages/framework-arduinoespressif32/cores/esp32/main.cpp:50

ELF file SHA256: 1ea0d3baef53ddd7

Rebooting...

Given that EXCVADDR is a low value, it would appear that it's looking for a struct member off a null pointer or something? This stack trace comes from the ElegantOTA URL handling call to requestAuthentication(), but the crash in my real code comes through authenticate() (presumably because there is existing auth being sent by my web client) (i.e. line 28 vs line 29 in ElegantOTA.cpp). It's hard for me to understand why it would crash though, since AFAICT everything is a class variable held by the class instances (whether WebServer or ElegantOTAClass).

Thanks for any ideas!

mathieucarbou commented 9 months ago

If using both sta and ap, I don't think you need to have 2 instances of each. You can create a normal ws instance bound on 0.0.0.0 and have it serve both. To decide which endpoint is served you can use filtering and rewriting.

ElegantOTA is a static class, not an object that can be instantiated so there can only be one instance. The consecutive begin calls would overwrite each other.

chconnor commented 9 months ago

Thanks for the response!

I realized I made an incredibly dumb mistake: the declaration of the second ElegantOTAClass object is inside setup() so it is lost on exit of that function... when I move it outside, everything works as expected.

I started out with a central webserver for both but couldn't find a guaranteed way to discriminate between calls to one or the other... it seemed theoretically possible that the Soft AP IP and the WLAN IP could be the same, even if near-zero chance, but maybe that's not true. In hindsight I suppose I could have just used that...

I'm not sure what you meant by "ElegantOTA is a static class"? -- I believe there is an instance of the ElegantOTAClass declared at the end of ElegantOTA.cpp? It seems like the entire class has no static variables or methods? I'm doing: ElegantOTAClass ElegantOTA2; and it seems to be fine now.

Corrected code:

#include <Arduino.h>
#include <WiFi.h>
#include <WebServer.h>
#include <ElegantOTA.h>

WebServer *APwebserver, *STAwebserver;

void handleurlSTA()
{
  if (!STAwebserver->authenticate("username", "password"))
  {
    STAwebserver->requestAuthentication();
    return;
  }

  STAwebserver->send(200, "text/html", "STA webserver");
}

void handleurlAP()
{
  if (!APwebserver->authenticate("username", "password"))
  {
    APwebserver->requestAuthentication();
    return;
  }

  APwebserver->send(200, "text/html", "AP webserver");
}

ElegantOTAClass ElegantOTA2;

void setup()
{
  Serial.begin(115200);
  while (!Serial) delay(10);

  WiFi.mode(WIFI_AP_STA);
  WiFi.softAP("softap", "password");
  WiFi.begin("wifiusername", "wifipassword");

  while(WiFi.status() != WL_CONNECTED)
  {
    Serial.print(".");
    delay(200);
  }

  Serial.println(" connected");
  Serial.printf("STA IP: %s\n", WiFi.localIP().toString().c_str());
  Serial.printf("Soft AP IP: %s\n", WiFi.softAPIP().toString().c_str());

  APwebserver = new WebServer(WiFi.softAPIP(), 80);
  STAwebserver = new WebServer(WiFi.localIP(), 80);
  APwebserver->on("/", handleurlAP);
  STAwebserver->on("/", handleurlSTA);

  ElegantOTA.begin(STAwebserver, "username", "password");
  ElegantOTA2.begin(APwebserver, "username", "password");

  STAwebserver->begin();
  APwebserver->begin();
}

void loop()
{
  STAwebserver->handleClient();
  APwebserver->handleClient();
}
mathieucarbou commented 9 months ago

Sorry for the poor words. I meant to say that ElegantOTA has is provided as a static class for a reason: it is using under the hood resources which cannot be shared with 2 ElegantOTA instances.

It is not impossible what you do, but there is no synchronisation point to make sure there is no concurrent upload at any point in time or even reboot.

In my project, I am using one (ESPAsyncWebServer) instance, it works on both AP and STA mode (but not at the same time) and I am using filters and rewrite to decide which endpoint is activated on which WiFi mode and where it points to.

This is using far less resources (stack and heap) then duplicated all these instances.

chconnor commented 9 months ago

I appreciate the thoughts -- I think in my case it may be tolerable because WebServer is single threaded (with the project's loop() thread being the thread via handleClient()); I didn't see any under-the-hood resources that are shared by multiple ElegantOTAClass instances (ElegantOTAClass, WebServer, WiFiServer, and Server are all fully-self-contained) with the exception of the Update instance that it holds, which could get the wrong MD5 after URL ota/start if a different client subsequently writes via ota/upload. I suppose the Update.begin()/end() calls could also get things confused, but it seems like the worst that could happen is the update fails and has to be redone. (If this was a commercial product I would think otherwise, since I don't understand how the partitioning stuff works.) And of course I need to be sure that my onStart( ) and onEnd() are aware that multiple clients could call them in unexpected orders.

Let me know if I've missed anything!

May I ask how you do your filtering/rewriting? In my case I have 15 URL endpoints that all come to either webserver. The main difference is requiring HTTP auth for the STA connections. As mentioned, my original idea was to discriminate based on the server IP, but I went the way I did because I wasn't certain those were guaranteed to be unique.

mathieucarbou commented 9 months ago

but it seems like the worst that could happen is the update fails and has to be redone

I don't know how the Update class is synchronised and how it copes with concurrent writes and handle md5 computing, but yes this is the shared resource I was talking about.

May I ask how you do your filtering/rewriting?

https://github.com/yubox-node-org/ESPAsyncWebServer?tab=readme-ov-file#using-filters

With ESPAsyncWebServer, there are some filters which can be added to handlers, and handlers can be rewrites. So these 2 combine give a lot of flexibility to change the rules and destination of the endpoints based on any contextual data.

For exemple, in my case I have these handlers:

chconnor commented 9 months ago

Thanks! Looks like they are just comparing IPs as well... maybe my paranoia is unfounded. :-)