LennartHennigs / ESPTelnet

ESP library that allows you to setup a telnet server for debugging.
MIT License

ESP freezes when losing client connection #54

Closed by Laxilef 10 months ago

Laxilef commented 10 months ago

Hi, I noticed something strange. If a client suddenly disconnects from the telnet server, the ESP goes into an endless loop. At the same time, the watchdog does not trigger and the ESP does not reboot.

The ESP comes back to life if I:

  1. forcibly disconnect the ESP client on the Wi-Fi router
  2. reconnect to the telnet server

It feels like the code is stuck somewhere in an infinite loop that calls yield(), because the watchdog timer is not firing. This is easy to reproduce:

  1. Use ESPTelnet together with PubSubClient
  2. Every 5 seconds, publish something to an MQTT topic and write something to telnet
  3. Connect to the telnet server
  4. Disable Wi-Fi on the telnet client without terminating the session in the telnet client
  5. Observe that the MQTT client on the ESP has stopped publishing messages

I tested this on ESP8266 and ESP32; the behavior is the same. If I end the session in the telnet client, the problem does not occur.

It looks like the TCP connection is stuck, waiting for a response from the client. I didn’t see any bugs in your code: performKeepAliveCheck should disconnect the client, but something prevents it from doing so.

Do you have any ideas?

Laxilef commented 10 months ago

While investigating, I found that the freeze occurs after calling Stream::write(). It is caused by the timeout wait in ClientContext::_write_from_source():

    size_t _write_from_source(const char* ds, const size_t dl)
    {
        assert(_datasource == nullptr);
        assert(!_send_waiting);
        _datasource = ds;
        _datalen = dl;
        _written = 0;
        _op_start_time = millis();
        do {
            if (_write_some()) {
                _op_start_time = millis();
            }

            if (_written == _datalen || _is_timeout() || state() == CLOSED) {
                if (_is_timeout()) {
                    DEBUGV(":wtmo\r\n");
                }
                _datasource = nullptr;
                _datalen = 0;
                break;
            }

            _send_waiting = true;
            // will resume on timeout or when _write_some_from_cb or _notify_error fires
            esp_delay(_timeout_ms, [this]() { return this->_send_waiting; });
            _send_waiting = false;
        } while(true);

        if (_sync)
            wait_until_acked();

        return _written;
    }
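To make the effect of that loop easier to see, here is a minimal host-side sketch in plain C++ (not Arduino code, and not the core's implementation). It assumes that a vanished peer never ACKs, so the send buffer never drains. `FakeClock`, `FakeConnection`, `WriteResult` and `blocking_write` are hypothetical names invented for this illustration, and the clock is advanced by hand instead of calling millis().

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for millis(): a clock advanced by hand.
struct FakeClock {
    std::uint32_t now_ms = 0;
};

// Hypothetical model of a TCP send buffer.
struct FakeConnection {
    std::size_t capacity;    // total size of the send buffer
    std::size_t free_space;  // bytes the buffer can still accept
    bool peer_alive;         // true: ACKs arrive and release buffer space

    // Accept as many bytes as currently fit; return how many were taken.
    std::size_t write_some(std::size_t wanted) {
        const std::size_t taken = std::min(wanted, free_space);
        free_space -= taken;
        return taken;
    }

    // Model one wait; returns how long the writer slept.
    std::uint32_t wait_for_space(std::uint32_t timeout_ms) {
        if (peer_alive) {
            free_space = capacity;  // ACK received, buffer drained
            return 1;               // woken early (1 ms is an arbitrary choice)
        }
        return timeout_ms;          // nobody wakes us: sleep the full timeout
    }
};

struct WriteResult {
    std::size_t written;       // bytes accepted by the connection
    std::uint32_t blocked_ms;  // simulated time spent inside the call
};

// Same overall shape as the loop above: try to send, stop when everything
// is written or the timeout window has elapsed, otherwise wait and retry.
WriteResult blocking_write(FakeConnection& conn, FakeClock& clock,
                           std::size_t len, std::uint32_t timeout_ms) {
    assert(timeout_ms > 0);
    const std::uint32_t start = clock.now_ms;
    std::uint32_t op_start = start;
    std::size_t written = 0;
    for (;;) {
        const std::size_t n = conn.write_some(len - written);
        if (n > 0) {
            written += n;
            op_start = clock.now_ms;  // progress restarts the timeout window
        }
        const bool timed_out = clock.now_ms - op_start >= timeout_ms;
        if (written == len || timed_out) break;
        clock.now_ms += conn.wait_for_space(timeout_ms);
    }
    return {written, clock.now_ms - start};
}
```

In this model, a 10-byte write to a dead peer with a full buffer and a 5000 ms timeout (an illustrative value, not necessarily the core's default) returns 0 bytes written after 5000 ms, while the same write to a live peer completes after a single short wait. If the real write behaves similarly, a sketch that writes to telnet every 5 seconds would spend most of its time inside the write call, which would be consistent with the stalled MQTT publishing described earlier in this thread.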

I'll check a few things and open a PR.