letscontrolit / ESPEasy

Easy MultiSensor device based on ESP8266/ESP32
http://www.espeasy.com
Other
3.26k stars 2.21k forks source link

Fix rules upload limit by changing form enctype #3868

Open CurlyMoo opened 2 years ago

CurlyMoo commented 2 years ago

The ESP8266Webserver handles the multipart/form-data differently from the default application/x-www-form-urlencoded enctype.

The multipart/form-data is post data is fully buffered in memory before exposing it while the application/x-www-form-urlencoded is a stream. That's the whole point why the firmware uploads do work fine, but the regular uploads fail when the content gets too big.

Switching to multipart/form-data as the rules form enctype will remove that size limit, because the uploaded ruleset can just be written directly to the filesystem and parse efficiently from there. It has been proposed to just upload large rulesets to the filesystem directly from the Arduino software. This would remove the need for that dirty-fix.

I wrote before in this issue https://github.com/letscontrolit/ESPEasy/issues/3164 that i wrote a rules library myself for better rules implementation. I did the same with the webserver: https://github.com/CurlyMoo/webserver/

My webserver can be used using a lot less memory because it hardly buffers anything. That means both multipart/form-data and application/x-www-form-urlencoded are processed as streams. It also means you don't have to buffer a whole webpage for sending, but only buffer small parts at a time when not using PROGMEM. Another major feature is that it can run both synced or fully asynced (using LWIP) without having to recompile.

I don't expect you to switch, but then at least change the form enctype for rules parsing so the size limit can go.

TD-er commented 2 years ago

When serving a web page, there is hardly anything buffered as we send almost everything via chunked transfer. To overcome the issue with POST'ing the rules form, we use a small JavaScript code to actually upload it to the ESP. This means the size limit for the rules is not a hard limit anymore, but I left it there to attend users trying to keep the rules small as processing the rules is currently still taking quite some resources as you of all people know best :) I still need to make some time to properly look into your code to see how we can integrate it in ESPEasy as I still believe parsing pre-compiled binary code is the way to go for rules handling.

I will have a look at what you did with the webserver part to see how we can improve the POST of other pages as there are some pages which do have quite a lot of text to POST when submitting a page.

CurlyMoo commented 2 years ago

When serving a web page, there is hardly anything buffered as we send almost everything via chunked transfer.

I didn't mean buffering in the HTTP protocol itself, but on the ESP side. The current webserver expects you to fully serve a webpage in a single go. Therefor, blocking any operation until it's all sent. Secondly, any dynamic content should be buffered in memory if you want to send it with the same request. And the bigger the page, the more buffers needed. Whilst you can also solve it like i did in the HeishaMon: https://github.com/IgorYbema/HeishaMon/blob/d4f82e9b7a72935eb9ca8b68a4a4192023b9295b/HeishaMon/webfunctions.cpp#L826

} else if (client->content >= 2 && client->content < size+2) {
    webserver_send_content_P(client, PSTR("<option value=\""), 15);

    char str[20];
    itoa(client->content-2, str, 10);
    webserver_send_content(client, str, strlen(str));

    webserver_send_content_P(client, PSTR("\">"), 2);

    webserver_send_content_P(client, tzdata[client->content-2].name, strlen_P(tzdata[client->content-2].name));
    webserver_send_content_P(client, PSTR("</option>"), 9);

The client->content increases each ESP loop until all information is sent. The content is placed in a ordered linked list so the webserver knows what still needs to be sent each loop iteration before it requests the next block to sent. This makes sending dynamic content super efficient.

To overcome the issue with POST'ing the rules form, we use a small JavaScript code to actually upload it to the ESP.

You don't need to use these kinds of dirty fixed when just moving to multipart/form-data. This is a feature implemented in the current ESP8266Webserver where they handle both enctypes differently.

CurlyMoo commented 2 years ago

I still need to make some time to properly look into your code to see how we can integrate it in ESPEasy as I still believe parsing pre-compiled binary code is the way to go for rules handling.

In regard to this topic i can say i'm controlling my heatpump for months now without any issues with the rules library (in a standalone application).

TD-er commented 2 years ago

Please have a look at the Web_StreamingBuffer class. This is a fixed size buffer which does flush itself to the connected client as soon as it's full, or when the page is finished.

Still the page handling functions are blocking, but it is for sure not completely allocated in memory. Some pages when served are over 40k in size, which for sure cannot be held in memory.

So I do have wrapped all sending in separate functions to use html_add which does flush it to the TXBuffer object. These helper functions all have various function parameter signatures like accepting const String& and also flash strings to minimize the number of conversions to memory allocated strings as much as possible. So I think (unless I totally don't get what you mean, which is possible) we're doing exactly the same only now with TXBuffer object which does the flushing for us.

About the dirty tricks... Totally agree, but I wonder how long this is present in the ESP8266Webserver code as I have looked at it before of course or else I didn't have to come up with such a "dirty fix" as you put it so elogently ;)

CurlyMoo commented 2 years ago

as you put it so elogently

I even call the ESP8266Webserver a dirty fix on its own 😉

TD-er commented 2 years ago

I still need to make some time to properly look into your code to see how we can integrate it in ESPEasy as I still believe parsing pre-compiled binary code is the way to go for rules handling.

In regard to this topic i can say i'm controlling my heatpump for months now without any issues with the rules library (in a standalone application).

I think the biggest hurdle for me is to know where to start.... If you have like a pull request on how to integrate your rules preprocessor and interpreter, then I think it would be easier for me to get started. N.B. I do want to have it "optional" or more like modular, so I can decide to include it or not and maybe even the other way around to exclude the old code if the binary precompiled code can be shared among nodes. All with making as small builds as possible to regain the minimal OTA builds.

CurlyMoo commented 2 years ago

If you have like a pull request on how to integrate your rules preprocessor and interpreter, then I think it would be easier for me to get started.

I will be making that PR for the HeishaMon, maybe we can move that discussion to slack when that's ready?

TD-er commented 2 years ago

Sure

CurlyMoo commented 2 years ago

Please have a look at the Web_StreamingBuffer class.

Good point. I overlooked that sendContent feature. Only issue i didn't see a solution for is on how to keep track on where you are in sending.

TD-er commented 2 years ago

The only "problem" here is that you can only handle one request at a time as the TXBuffer object is global and there is only one. For now it doesn't seem to be an issue, but it does become an issue if you plan on keeping connections open like on web sockets. That's also why we actively fetch updates like on the Devices page and the web log.

It would be great if we could support web sockets, but that does require a rather large (HUGE) overhaul of the code and I plan to only do that when we're moving towards a single-page application where the ESP may interact with the frontend via a REST interface.

CurlyMoo commented 2 years ago

The only "problem" here is that you can only handle one request at a time as the TXBuffer object is global and there is only one.

That's what my code tries to solve. The sendlist is client specific. The sendlist is one of the mechanisms and the client->content counter the second. So you can dynamically choose how much data is sent each loop. More chunk can imply bigger memory consumption for the send list.

It would be great if we could support web sockets, but that does require a rather large (HUGE) overhaul of the code and I plan to only do that when we're moving towards a single-page application where the ESP may interact with the frontend via a REST interface.

Websockets is something i want to implement in my webserver as well, but i want to implement the rules system in the HeishaMon first.