ssilverman / QNEthernet

An lwIP-based Ethernet library for Teensy 4.1 and possibly some other platforms
GNU Affero General Public License v3.0

Large TCP packet is broken #61

Closed mateusz-kusmierz closed 8 months ago

mateusz-kusmierz commented 9 months ago

Hello. I'm running the test code below, and it produces an unexpected result on my setup. I know it might not be QNEthernet's fault, but I have nowhere else to go for help.

#if !(defined(CORE_TEENSY) && defined(__IMXRT1062__) && defined(ARDUINO_TEENSY41))
#error Only Teensy 4.1 supported
#endif

#define BUFFER_SIZE 32000
#include "QNEthernet.h"  // https://github.com/ssilverman/QNEthernet
using namespace qindesign::network;
#include <AsyncWebServer_Teensy41.h>

AsyncWebServer server(80);

void handleRoot(AsyncWebServerRequest *request) {
  char temp[BUFFER_SIZE];
  memset(temp, 0, sizeof(temp));
  strcat(temp, "<html><body>");

  for (int i = 0; i < 11559; i++) {
    strcat(temp, "0");
  }
  strcat(temp, "</body></html>");

  request->send(200, "text/html", temp);
  Serial.println(strlen(temp));
}

void setup() {
  Serial.begin(115200);

  Ethernet.begin();
  if (!Ethernet.waitForLocalIP(5000)) {
    Serial.println(F("Failed to configure Ethernet"));
    if (!Ethernet.linkStatus()) {
      Serial.println(F("Ethernet cable is not connected."));
    }
    // Stay here forever
    while (true) {
      delay(1);
    }

  } else {
    Serial.print(F("Connected! IP address:"));
    Serial.println(Ethernet.localIP());
  }

  delay(1000);
  server.on("/", HTTP_GET, [](AsyncWebServerRequest *request) {
    handleRoot(request);
  });

  server.begin();
  Serial.print(F("HTTP EthernetWebServer is @ IP : "));
  Serial.println(Ethernet.localIP());
}

void loop() {
}

It should print only zeros on the HTML page, but instead it looks like this: (screenshot)

It seems to print random garbage in the middle if the payload length is greater than or equal to 11585 bytes.

It looks like this payload is split into 3 TCP packets, but the third packet has unexpected bytes at the beginning of its payload: (screenshot)

Can you help me, or do you know how to track down or solve this issue? Thanks.

ssilverman commented 9 months ago

Hi. Thanks for the question. Just acknowledging that I see it and I’ll get to it when I have a chance.

ssilverman commented 9 months ago

That library (AsyncWebServer_Teensy41) doesn't really use QNEthernet. It does use the initialization machinery, but then uses the included lwIP stack and not the QNEthernet API. Without spending more time, it's hard to follow what's going on.

Is your content being sent in a single packet that's greater than the TCP MSS size, requiring IP-level fragmentation and reassembly? Could you try this experiment: Set IP_FRAG to 0 in lwipopts.h in your QNEthernet install.
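For a sense of scale, TCP normally splits a stream into segments no larger than the MSS before the data reaches the IP layer. The sketch below computes how many segments a payload needs. The MSS of 1460 bytes is an assumption (a 1500-byte Ethernet MTU minus 20-byte IPv4 and 20-byte TCP headers); the thread does not state the configured value, and `segmentCount` and `kAssumedMss` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Assumed MSS, not taken from the thread: 1500-byte MTU minus 40 bytes of headers.
const std::size_t kAssumedMss = 1460;

// Number of TCP segments needed for payloadLen bytes when each segment
// carries at most mss bytes of payload (ceiling division).
inline std::size_t segmentCount(std::size_t payloadLen, std::size_t mss) {
  assert(mss > 0);
  return (payloadLen + mss - 1) / mss;
}
```

Under that assumption, `segmentCount(11585, kAssumedMss)` gives 8 for the HTML body alone (HTTP headers add a little more), and a 5840-byte write is exactly 4 segments.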

mateusz-kusmierz commented 9 months ago

Hi. I have tried that, but it seems that there is no difference with IP_FRAG set to 0.

mateusz-kusmierz commented 9 months ago

I have added Serial.print for debugging in Teensy41_AsyncTCP_Impl.h, printing the packet size and the first 6 bytes just before the data is passed to tcp_write(). I'm not sure whether the tcp_write() used here is pulled directly from QNEthernet or not. (screenshot)

And the console output:

5840    5840    48 54 54 50 2F 31
5840    5840    30 30 30 30 30 30
5840    12      62 6F 64 79 3E 3C

So it looks like tcp_write() is asked to write 3 packets, but in Wireshark I can see 4 packets (the first screenshot in this thread was badly marked): (screenshot)

The output on the webpage still contains weird characters in the middle: (screenshot)

ssilverman commented 9 months ago

To reiterate: Teensy41_AsyncTCP doesn’t use the QNEthernet library; it only uses the initialization machinery and the included lwIP stack code. tcp_write() is an lwIP call. (See lwIP here: https://savannah.nongnu.org/projects/lwip/)

Have a look at the QNEthernetClient.cpp and internal/ConnectionManager.cpp files for how I use lwIP. I believe there are also some notes at the top of lwIP’s tcp.c file for how to use those TCP functions. That’s where I started.

As another option, I also do consulting if you want to reach out to me privately. It’s just that I’ve spent no time with those lwIP-based libraries (Teensy41_AsyncTCP and the web server one), so I’m not familiar with how they work.

ssilverman commented 9 months ago

I’ll add: I don’t believe there’s an advantage to using those so-called “async” libraries, because I’ve configured lwIP to use no interrupts, threading, or concurrency, and since they use my configuration, they don’t use any of those either.

Under the covers, data is “handled” whenever my Ethernet.loop() function is called, which is often from inside the automatically-called system yield() function and sometimes from other places internal to the library. Now, I don’t think those libraries are using my lwIP callback functions, which probably means they’ve installed their own, and those are probably also called whenever the system yield() is called.

In short, I don’t believe there’s a real speed advantage to using them, as long as the main program is written well.
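The polling model described above can be illustrated with a small simulation. This is a plain C++ sketch, not QNEthernet or lwIP code: `PolledStack`, `frameArrived`, and `callbacksBeforeAndAfterPoll` are hypothetical names, and the only point is that callbacks run inside the poll call, whichever library installed them.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <queue>
#include <string>
#include <utility>

// Hypothetical model of a poll-driven network stack (an assumption for
// illustration; it does not mirror the real library's internals).
class PolledStack {
 public:
  using Handler = std::function<void(const std::string&)>;

  explicit PolledStack(Handler h) : handler_(std::move(h)) {
    assert(handler_ != nullptr);
  }

  // Simulates data arriving from hardware: it is only queued here.
  void frameArrived(std::string data) { pending_.push(std::move(data)); }

  // Simulates the periodic poll: callbacks run here and nowhere else.
  std::size_t poll() {
    std::size_t handled = 0;
    while (!pending_.empty()) {
      handler_(pending_.front());
      pending_.pop();
      ++handled;
    }
    return handled;
  }

 private:
  Handler handler_;
  std::queue<std::string> pending_;
};

// Returns how many callbacks had run before the poll and after it.
inline std::pair<int, int> callbacksBeforeAndAfterPoll() {
  int calls = 0;
  PolledStack stack([&calls](const std::string&) { ++calls; });
  stack.frameArrived("GET / HTTP/1.1");
  stack.frameArrived("more data");
  const int before = calls;  // nothing has run yet: no poll so far
  stack.poll();
  return {before, calls};
}
```

In this model an "async" callback is no more concurrent than a direct read: both only make progress when the main program lets the poll run.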

mateusz-kusmierz commented 8 months ago

It looks like you are right!

I think I solved it, but I'll never know for sure... There is a flag, nonCopyingSend, in this async library that is normally set to true. If I set it to false in the send call, request->send(200, "text/html", temp, false), the issue does not appear.
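One plausible explanation, not confirmed anywhere in this thread: temp in handleRoot() is a local array, so a non-copying send may leave the stack holding a pointer to memory that gets reused once the handler returns. lwIP exposes the same trade-off through the TCP_WRITE_FLAG_COPY flag of tcp_write(). The sketch below is plain C++ and does not use either library; `DeferredSender` and `sendThenReuse` are hypothetical names. It overwrites a still-live buffer rather than dangling a stack pointer, so its behavior stays well-defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of a sender that transmits later; not a real library class.
class DeferredSender {
 public:
  // Queue data for later transmission. With copy == false only the pointer
  // is kept, so the caller must leave the bytes untouched until flush().
  void queue(const char* data, std::size_t len, bool copy) {
    assert(data != nullptr);
    Chunk c;
    if (copy) {
      c.owned.assign(data, len);  // private copy, safe if the caller reuses data
    } else {
      c.ptr = data;               // reference only
    }
    c.len = len;
    chunks_.push_back(std::move(c));
  }

  // "Transmit" everything queued so far and return what went on the wire.
  std::string flush() {
    std::string wire;
    for (const Chunk& c : chunks_) {
      wire.append(c.ptr ? c.ptr : c.owned.data(), c.len);
    }
    chunks_.clear();
    return wire;
  }

 private:
  struct Chunk {
    const char* ptr = nullptr;
    std::string owned;
    std::size_t len = 0;
  };
  std::vector<Chunk> chunks_;
};

// Queue seven zeros, overwrite the caller's buffer, then flush.
inline std::string sendThenReuse(bool copy) {
  char buf[8] = "0000000";
  DeferredSender sender;
  sender.queue(buf, 7, copy);
  std::memcpy(buf, "GARBAGE", 7);  // buffer reused before the data goes out
  return sender.flush();
}
```

In this model, `sendThenReuse(true)` returns the original zeros and `sendThenReuse(false)` returns the overwritten bytes, which resembles the symptom reported above. Copying costs extra memory and time, which is presumably why the non-copying mode is the library's default.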

What library do you recommend for handling a webpage plus responses?

I will test more and then close this issue. Thanks for now!

ssilverman commented 8 months ago

I’m glad you found a potential cause! :)

I usually use my own code for a web server (I’ve been writing them for years). That, of course, doesn’t help those who don’t want to write their own. One of these days, I might consider writing a rudimentary web server example.

mateusz-kusmierz commented 8 months ago

That would be great!