CurlyMoo / webserver

ESP ready low memory webserver
Mozilla Public License 2.0
3 stars 1 forks source link

nasty bug seen while firmware uploading on heishamon #3

Open IgorYbema opened 2 months ago

IgorYbema commented 2 months ago

I found out that there is a nasty bug while using this webserver code in the heishamon project.

Uploading firmware in a single session works fine. But if, during the upload, there is another session (websocket or just a webrequest like the JSON data request), this can cause the firmware upload to fail.

I found out that it breaks here: https://github.com/CurlyMoo/webserver/blob/8abb05f2e1399ef2b8fe07cbbc801833a75e8c11/webserver.cpp#L858

And indeed, the readlen is beyond totallen in such a situation. So it somehow tries to read beyond the end of the data. And I checked a wireshark dump on a succesfull and failed upload. The sent data is exactly the same. So I can only conclude that the 2nd webserver session is somehow corrupting the receiving firmware memory on the 1st webserver session.

I tried with just a webserver request to the JSON url, but also only opening the websocket session (disabled webpage refresh). So it is not related to only websocket.

I can't figure out a way to make a test for this in the unittester as it requires sending and receiving happening at the same time.

CurlyMoo commented 2 months ago

I tried with just a webserver request to the JSON url, but also only opening the websocket session (disabled webpage refresh). So it is not related to only websocket.

I don't quite understand this sentence.

IgorYbema commented 2 months ago

If you open the normal heishamon webpage, it does two things at the same same. It refreshes the page using a 30 sec JSON request (on /json url) and it opens the websocket. I noticed that the firmware update failed if, on another tab, the heishamon website was opened. However try to understand if it was only the websocket or another webrequest, I tried both seperatly. So a seperate websocket test (without the JSON reload active), and also just downloading the JSON info (without websocket active). But it didn't; matter.

In fact, if the webserver is sending data back to a client (another client connection) while it is receiving data (firmware), the firmware will fail (it will try to read more data than available).

I hope this is more clear now :-)

I am thinking of two possibilities:

CurlyMoo commented 2 months ago

Does it fail on the md5 or also on the content length?

IgorYbema commented 2 months ago

It doesn't reach the md5 check as the 'return -1' causes the webserver to end abruptly. So it fails on the readlen > totallen

But I did check two wireshark dump on which one was a good transfer and one was a bad, failed, transfer. Both where exactly the same (only differnt port numbers ofcourse) so the client did sent the same data. It was processed differently by the webserver (in the case when it was also processing sending data to another client).

CurlyMoo commented 2 months ago

I managed to create a unittest that indeed triggers the named line. Now need to see why.

IgorYbema commented 2 months ago

Great job!

CurlyMoo commented 2 months ago

Hmm, i can make it to fail when both webserver clients share the same client_read function which mixes up the read pointer. However, when i give each client it's own read function it just works.

That make me wonder if somehow i made a mistake in the acceptance of new socket clients, mixing the up when multiple clients are connected.

CurlyMoo commented 2 months ago

I think this discussions is similar to our issue: https://github.com/arduino-libraries/WiFiNINA/issues/176

CurlyMoo commented 2 months ago

Maybe we need to switch to a different library instead of WifiClient as described here: https://github.com/espressif/arduino-esp32/blob/master/libraries/WiFi/examples/WiFiTelnetToSerial/WiFiTelnetToSerial.ino

CurlyMoo commented 2 months ago

I pushed an update to the websocket branch. I'm also curious if this bug also occurs when running the webserver in async mode.