ssilverman / QNEthernet

An lwIP-based Ethernet library for Teensy 4.1 and possibly some other platforms
GNU Affero General Public License v3.0
86 stars 24 forks source link

Possible bug in EthernetClient::connected() and operator bool() #40

Closed ssilverman closed 1 year ago

ssilverman commented 1 year ago

Copied from here: https://forum.pjrc.com/threads/72883-Possible-bug-in-EthernetClient-connected()-and-operator-bool() User: dgnuff


This code reproduces the problem:

#include <QNEthernet.h>

using namespace qindesign::network;

EthernetClient client;

bool connected = false;

void PrintString(char const *s)
{
    Serial.print(s);
}

// the setup routine runs once when you press reset:
void setup()
{
    Serial.print("Starting ...");
    Ethernet.begin();
    IPAddress ipAddress;
    for (int i = 0; i < 300; i++)
    {
        ipAddress = Ethernet.localIP();
        if (ipAddress)
        {
            Serial.println(".");
            break;
        }
        Serial.print(".");
        delay(100);
    }
    Serial.printf("IP Address: %d.%d.%d.%d\r\n", ipAddress[0], ipAddress[1], ipAddress[2], ipAddress[3]);

    uint8_t mac[6];
    Ethernet.macAddress(mac);
    Serial.printf("Mac Address: %02x:%02x:%02x:%02x:%02x:%02x\r\n", mac[0], mac[1], mac[2], mac[3], mac[4], mac[5]);

    ipAddress = IPAddress(52, 250, 42, 157);  // IP Address of duckduckgo.com: 52.250.42.157

    client.connectNoWait(ipAddress, 443);   // Connect to SSL port.
}

// the loop routine runs over and over again forever:
void loop()
{
    if (!connected && client.connected())
    {
        connected = true;
        Serial.print("Connection has been established\r\n");
    }
    delay(1);
}

I'm trying to use EthernetClient::connectNoWait() to establish connections non-blocking. It seemed reasonable that EthernetClient::connected() would be a way to poll for completion of the connection. Except EthernetClient::connected() permanently returns false, even when use of WireShark clearly shows the SYN/SYN-ACK/ACK flow on the wire. And when connecting to my PC on the LAN, a test server application registers that a connection has been accepted. EthernetClient::operator bool() suffers the same problem, even though its logic is slightly different.

So I did a bit of digging into EthernetClient::connected(), and added a debug line to it as follows:

uint8_t EthernetClient::connected() {
  if (conn_ == nullptr) {
    return false;
  }
  if (!conn_->connected && conn_->remaining.empty()) {
    PrintString("EthernetClient::connected() setting conn_ to nullptr\r\n");  // This line added
    conn_ = nullptr;
    return false;
  }
  Ethernet.loop();  // Allow information to come in
  return true;
}

As expected, the debug line was executed, indicating that conn_->remaining is indeed empty. Hardly a surprise, it appears that the remaining vector is part of the read logic, and I'd expect it to be empty if the connection hasn't yet been established.

Modifying EthernetClient::connected() as follows causes the test program above and my main application to start working at lease as far as EthernetClient::connectNoWait() is concerned, but that's a change that I can't make for the long term, because 40+ years of programming experience tells me that doing so will almost certainly break something else.

uint8_t EthernetClient::connected() {
  if (conn_ == nullptr || !conn_->connected) {
    return false;
  }
  //if (!conn_->connected && conn_->remaining.empty()) {
  //  SPS("EthernetClient::connected() setting conn_ to nullptr\r\n");
  //  conn_ = nullptr;
  //  return false;
  //}
  Ethernet.loop();  // Allow information to come in
  return true;
}

The solution I'd use is to change the first line so that instead of just blindly testing !conn_->connected, that test is also guarded by something that determines we don't yet have a connection because the initial three-way handshake hasn't completed. Only if that condition is true would we check for !conn_->connected. I just don't know how to do that. Once that change is made, the commented out logic can be restored, and the routine should then continue to work correctly.

ssilverman commented 1 year ago

Fixed in 7b3ecbc2.

dgnuff commented 1 year ago

Thanks for fixing this. I merged the changes from https://github.com/ssilverman/QNEthernet/commit/7b3ecbc2b9ac4415e7b68f5c7898f2fd55519b78 into the library on my system here, and it's now working correctly.