earlephilhower / arduino-pico

Raspberry Pi Pico Arduino core, for all RP2040 and RP2350 boards
GNU Lesser General Public License v2.1
1.98k stars 412 forks source link

Attempts to send larger chunks (>2920) of data over TCP fails #725

Closed oddstr13 closed 2 years ago

oddstr13 commented 2 years ago

I have not determined exactly where this limitation comes from, but it looks like it is lwip or the layers below, possibly the wifi chip itself.

2920 is 2*TCP_MSS (1460), which is the max amount of data getting sent in one TCP packet.

I hit this bug attempting to serve a 3851 B index file, where the headers where sent but no data arrived.

02:16:00.445 -> :wr 85 0
02:16:00.445 -> :wrc 85 85 0
02:16:00.445 -> :ref 2
02:16:00.445 -> :wr 2921 0
02:16:00.445 -> :wrc 2921 2921 -1
02:16:00.479 -> :ack 85
02:16:00.479 -> :wr 2921 0
02:16:00.479 -> :wrc 2921 2921 -1
02:16:05.483 -> :wr 2921 0
02:16:05.483 -> :wrc 2921 2921 -1
02:16:05.483 -> :wtmo
02:16:05.483 -> 0 bytes written.
02:16:05.483 -> :wr 930 0
02:16:05.483 -> :wrc 930 930 0
02:16:05.483 -> 930 bytes written.
02:16:05.517 -> :ack 930
02:16:05.517 -> :ur 2
    server.on("/", HTTP_GET, []() {
        size_t page_size = strlen(INDEX_PAGE);
        server.setContentLength(page_size);
        server.send(200, F("text/html"), emptyString);

        // WiFiClient does not know to chop up large socket writes,
        // and the chip on the PicoW doesn't seem to like larger
        // chunks than TCP_MSS(1460) * 2. Packet(s) are sent right away,
        // limited in size by TCP_MSS.
        /*
            01:25:37.570 -> :wr 85 0
            01:25:37.570 -> :wrc 85 85 0
            01:25:37.570 -> :ref 2
            01:25:37.570 -> :wr 2048 0
            01:25:37.570 -> :wrc 2048 2048 -1   // Error?
            01:25:37.604 -> :ack 85
            01:25:37.604 -> :wr 2048 0
            01:25:37.604 -> :wrc 2048 2048 0    // 2048 bytes written to WiFi
            01:25:37.637 -> :ack 1460           // TCP ACK
            01:25:37.637 -> :ack 588
            01:25:37.637 -> :wr 1803 0
            01:25:37.637 -> :wrc 1803 1803 0
            01:25:37.703 -> :ack 1460
            01:25:37.703 -> :ack 343
            01:25:37.703 -> :rcl pb=0 sz=-1
            01:25:37.703 -> :abort
        */
        WiFiClient client = server.client();
        const int chunksize = 2920;
        int chunks = ceil((double)page_size / chunksize);
        for (int i=0;i<chunks;i++) {
            int position = i*chunksize;
            int bytes_left = page_size - position;

            int written = client.write(&INDEX_PAGE[position], min(chunksize, bytes_left));
            Serial.printf("%i bytes written.", written);
            Serial.println();
            client.flush();
        }
    });
earlephilhower commented 2 years ago

What library are you using for the web server, BTW?

I think I can pull in the WiFiClient... into an MCVE. My guess is I've got an MTU mismatch somewhere. The LWIP stack should chop up the larger TCP packet into smaller ones of < MTU size, but having it send down 2K isn't going to work...

earlephilhower commented 2 years ago

This was not MSS/MTU-related, but it was because the MEM_SIZE buffer of outstanding packets couldn't fit your transmit size. Simple tweak to allow ClientContext to be smarter about retry under memory pressure, and adding a small amount of RAM to the LWIP preallocated space to help w/throughput.

oddstr13 commented 2 years ago

I'm using https://github.com/khoih-prog/WiFiWebServer mostly because the code was originally written for ESP8266, and it was a drop-in replacement. Just have to set #define USE_WIFI_NINA false before including, as it isn't aware of the PicoW and it's native WiFi yet.

The TCP packets themselves are getting chopped up, as you can see from the :ack debug prints in the code block comment. It is however failing to write data larger than 2* the data part of the MTU (TCP_MSS). It looks like there's an internal buffer somewhere that can't fit more than two packets worth of data, but the code feeding data to that part is not aware of this, and doesn't know to slice up the data into smaller pieces.

I was trying to mess a bit with lwipopts.h to see if any of those settings changed the behavior, but I think Arduino's automatic compile cache got in my way.

PS: Still messing around a bit, I see that you've merged a fix in the mean time. Simply increasing MEM_SIZE does nothing (I've tried setting it to 20k).

While decreasing the size of the attempted tcp_write if it fails will likely work around the problem by re-attempting, I don't think it is a proper fix.

I'll keep poking around some, and report back if I find something of interest

earlephilhower commented 2 years ago

PS: Still messing around a bit, I see that you've merged a fix in the mean time. Simply increasing MEM_SIZE does nothing (I've tried setting it to 20k).

The thing that probably got you there was that you need to rebuild libpico.a with the new defines. Changing the #define w/o a rebuild won't actually have any effect. It's done this way because right now Pico-SDK includes LWIP as part of its build process so it's just simpler. Changing the MEM_SIZE to 20K and rebuilding ./tools/libpico/makelibpico.sh did work, too.

The fix increases the MEM_SIZE in the blob as well as allows the tcp_write to make progress. So before if you tried to client.send(40k) it would never send anything unless MEM_SIZE was > 40K+(some amount for other stuff). Now, it will try sending a chunk of 40K, then 20K, then 10K, then 5K before giving up, for example.

The root is that we need a static assignment of memory since the wifi driver can run from IRQs and call in to pbuf stuff (and the periodic check-timers, too). In other implementations (ESP8266 for ex.) LWIP can do a malloc() to grab add'l memory, but you need to pump the LWIP and WIFI stack manually in your app or it won't do anything at all.

I tested the fix by sending the first ~8 paragraphs of Moby Dick on a connect() from the server, ~13K or so IIRC. You can see it try and send 13K, then progressively smaller amounts to try and make fwd. progress. There are limits, yes (i.e. if your RTT is many 100s of ms), but I don't think practically they're a problem for the majority of use cases on the little Pico.

oddstr13 commented 2 years ago

Aha, that explains why I can't get debug enabled. My system only has cmake 3.10, so I'm not able to get it compiled without finding a way of upgrading that.

In the example lwipopts.h, MEM_SIZE is 10k, and TCP_SND_BUF is 2k (2*MSS). Currently, TCP_SND_BUF is 11680, but MEM_SIZE is only 4000 (now 8k after your fix).

It looks like MEM_SIZE is supposed to be significantly larger than TCP_SND_BUF, and that things are likely to work just fine if it is?

https://github.com/earlephilhower/arduino-pico/blob/434c704a4b9f0cea5449c88a48f6f94ee9623593/libraries/WiFi/src/include/ClientContext.h#L456-L477 esp_delay() here would exit when a ACK was is received for data in the send buffer, and allow another chunk to be written right away.

https://github.com/earlephilhower/arduino-pico/blob/434c704a4b9f0cea5449c88a48f6f94ee9623593/libraries/WiFi/src/include/ClientContext.h#L504

It won't attempt to allocate anything larger than TCP_SND_BUF, even if the data written is larger. https://github.com/earlephilhower/arduino-pico/blob/434c704a4b9f0cea5449c88a48f6f94ee9623593/libraries/WiFi/src/include/ClientContext.h#L532

I am no closer to figuring out why the threshold where things stop working is 2MSS, but I think that exact number is suspicious. It could be related to 3MSS (or MTU) being larger than the allotted memory I guess.

I'd propose reducing TCP_SND_BUF well below MEM_SIZE, as we can't possibly fit anything bigger than that in there. Below half would likely be desirable.

earlephilhower commented 2 years ago

I'd propose reducing TCP_SND_BUF well below MEM_SIZE, as we can't possibly fit anything bigger than that in there. Below half would likely be desirable.

Sounds good. Reopening until I get that merge in.

earlephilhower commented 2 years ago

@oddstr13 the bug auto-closes with the merge, but if you'd like to give the latest master head a try it'd be appreciated. If you do find some more problems, you can reopen here or make a new one. Thx!

oddstr13 commented 2 years ago

I didn't pull in all of master, only the two relevant files (grabbed an updated copy of cmake).

Looks working to me (the http server didn't like my larger test string, but that's not relevant here)

Maybe a little sub-optimal, as it doesn't manage to add more data to the buffer before it empties, but that could be hard to fix and doesn't really matter too much.

Thanks for your swift responses! :slightly_smiling_face:

15:14:27.715 -> [WIFI] _prepareHeader sendHeader Conn close
15:14:27.715 -> :wr 86 0
15:14:27.715 -> :wrc 86 86 0
15:14:27.715 -> :ref 2
15:14:27.715 -> :wr 30815 0
15:14:27.715 -> :wrc 11594 30815 0
15:14:27.748 -> :ack 86
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :ack 1460
15:14:27.748 -> :wr 19221 11594
15:14:27.748 -> :ack 1374
15:14:27.748 -> :wrc 10306 19221 0
15:14:27.748 -> :wrc 1374 8915 0
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.781 -> :ack 1460
15:14:27.822 -> :wr 7541 23274
15:14:27.822 -> WS:ac
15:14:27.822 -> :rn 285
15:14:27.822 -> :wrc 7541 7541 0
15:14:27.822 -> :ur 2
15:14:27.822 -> [WIFI] _handleRequest OK
15:14:27.822 -> [WIFI] _handleRequest: _finalizeResponse
15:14:27.847 -> :ack 2920
15:14:27.847 -> :ack 2920
15:14:27.847 -> :ack 2920
15:14:27.847 -> :ack 241
15:14:27.847 -> :rcl pb=0 sz=-1
15:14:27.847 -> :abort