khoih-prog / AsyncMQTT_Generic

Arduino library providing an asynchronous MQTT client implementation for ESP8266, ESP32, Portenta_H7, STM32 and RP2040W. This library has been ported to support ESP32, WT32_ETH01 (ESP32 + LAN8720), ESP8266, Portenta_H7 (Ethernet or WiFi), STM32 (LAN8742A or LAN8720 Ethernet), Teensy 4.1 using QNEthernet, and RASPBERRY_PI_PICO_W with CYW43439 WiFi. TLS/SSL is currently supported for ESP32 only.
MIT License

SSL Connection to a MQTT broker triggers watchdog - WT32_ETH01 board #1

Closed cagabit closed 2 years ago

cagabit commented 2 years ago

Hi, I am not sure if it's a bug or a mistake on my side, but I thought this post would be useful in the end.

Using: WT32_ETH01 board Arduino 1.8.19 AsyncMQTT_Generic v.1.4.0

I am trying to connect to the MQTT broker myqtthub.com over SSL, using the example in /examples/WT32_ETH01/. I only changed these lines from the original example code:

#define MQTT_HOST "node02.myqtthub.com"

const uint8_t MQTT_SERVER_FINGERPRINT[] = {0x97, 0xE2, 0x30, 0xF9, 0xAE, 0x79, 0x16, 0xED, 0xDB, 0x2F, 0x9B, 0x78, 0xE5, 0xF2, 0x6D, 0x2C, 0x07, 0x6E, 0xCD, 0x2F};

I got the server fingerprint from this address (taken from the top of the last certificate page; I am not sure that is the right one):

https://crt.sh/?q=node02.myqtthub.com
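For readers repeating this step, a minimal sketch of turning a fingerprint copied as text into the 20-byte array the example expects. The 20 bytes above match the length of a SHA-1 digest; the helper name parseSha1Fingerprint and the accepted separators are assumptions made for illustration and are not part of AsyncMQTT_Generic.

```cpp
#include <cctype>
#include <cstddef>
#include <cstdint>

// A SHA-1 digest is 20 bytes, i.e. 40 hex digits.
const std::size_t SHA1_FINGERPRINT_LEN = 20;

// Hypothetical helper (not a library function): parses text such as
// "97:E2:30:..." or "97E230..." into 20 bytes. The separators ':', ' '
// and '-' are skipped. Returns false unless the text contains exactly
// 40 hex digits; on failure `out` may be partially written.
bool parseSha1Fingerprint(const char* text, std::uint8_t out[SHA1_FINGERPRINT_LEN])
{
  if (text == nullptr || out == nullptr) return false;

  std::size_t nibbles = 0;  // number of hex digits consumed so far

  for (const char* p = text; *p != '\0'; ++p)
  {
    const unsigned char c = static_cast<unsigned char>(*p);

    if (c == ':' || c == ' ' || c == '-') continue;         // separator
    if (!std::isxdigit(c)) return false;                    // not a hex digit
    if (nibbles >= 2 * SHA1_FINGERPRINT_LEN) return false;  // too long

    const std::uint8_t value = static_cast<std::uint8_t>(
      std::isdigit(c) ? (c - '0') : (std::tolower(c) - 'a' + 10));

    if (nibbles % 2 == 0)
      out[nibbles / 2] = static_cast<std::uint8_t>(value << 4);  // high nibble
    else
      out[nibbles / 2] |= value;                                 // low nibble

    ++nibbles;
  }

  return nibbles == 2 * SHA1_FINGERPRINT_LEN;  // false if too short
}
```

The resulting bytes can be compared one by one against MQTT_SERVER_FINGERPRINT to catch copy errors before flashing.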

When run, the debug output shows:

Starting FullyFeatureSSL_WT32_ETH01 on WT32-ETH01 with ETH_PHY_LAN8720
WebServer_WT32_ETH01 v1.4.0 for core v1.0.6-
AsyncMQTT_Generic v1.4.0 for ESP32 core v1.0.6-
ETH Connected
ETH connected
IP address: 
192.168.1.145
Connecting to MQTT...
E (18585) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (18585) task_wdt:  - async_tcp_ssl (CPU 0/1)
E (18585) task_wdt: Tasks currently running:
E (18585) task_wdt: CPU 0: IDLE0
E (18585) task_wdt: CPU 1: IDLE1
E (18585) task_wdt: Aborting.
abort() was called at PC 0x401505b0 on core 0

ELF file SHA256: 0000000000000000

Backtrace: 0x40088708:0x3ffbf880 0x40088985:0x3ffbf8a0 0x401505b0:0x3ffbf8c0 0x40086f1d:0x3ffbf8e0 0x401654d3:0x3ffbc260 0x4015201f:0x3ffbc280 0x4008b145:0x3ffbc2a0 0x40089996:0x3ffbc2c0

Rebooting...
ets Jun  8 2016 00:22:57

Without a secure (SSL) connection it works. Even if the fingerprint is wrong, it should report that it cannot connect; instead, the watchdog triggers while it is trying to connect.

I could not find any other information to post here. If I am doing something wrong, any hint will be useful, thank you.

khoih-prog commented 2 years ago

Hi @cagabit

I've tried using the same server / fingerprint you posted, but the WDT is not triggered.


1. The fingerprint is correct



2. Modified Code

#include <AsyncMqtt_Generic.h>

//#define MQTT_HOST         IPAddress(192, 168, 2, 110)
//#define MQTT_HOST         "broker.emqx.io"        // Broker address
#define MQTT_HOST "node02.myqtthub.com"

#if ASYNC_TCP_SSL_ENABLED

  #define MQTT_SECURE     true

  //const uint8_t MQTT_SERVER_FINGERPRINT[] = {0x7e, 0x36, 0x22, 0x01, 0xf9, 0x7e, 0x99, 0x2f, 0xc5, 0xdb, 0x3d, 0xbe, 0xac, 0x48, 0x67, 0x5b, 0x5d, 0x47, 0x94, 0xd2};
  const uint8_t MQTT_SERVER_FINGERPRINT[] = {0x97, 0xE2, 0x30, 0xF9, 0xAE, 0x79, 0x16, 0xED, 0xDB, 0x2F, 0x9B, 0x78, 0xE5, 0xF2, 0x6D, 0x2C, 0x07, 0x6E, 0xCD, 0x2F};

  const char *PubTopic  = "async-mqtt/WT32_ETH01_SSL_Pub";               // Topic to publish

3. Terminal

Using Board = WT32-ETH01 Ethernet Module
Starting FullyFeatureSSL_WT32_ETH01 on WT32-ETH01 with ETH_PHY_LAN8720
WebServer_WT32_ETH01 v1.4.1 for core v2.0.0+
AsyncMQTT_Generic v1.4.0 for ESP32 core v2.0.0+
ETH starting
ETH connected
ETH got IP
IP address: 192.168.2.128
Connecting to MQTT...
[AMQTT] CONNECTING
[AMQTT] ClientID = esp32-fc4b08fa48a8
[AMQTT] _onAck: ack len = 250
[AMQTT] _onAck: ack len = 143
[AMQTT] _onAck: ack len = 51
[AMQTT] TCP conn, MQTT CONNECT
[AMQTT] _addFront: new front, packetType = CONNECT
[AMQTT] _handleQueue: snd, packetType # CONNECT , tls: realSent = 61
[AMQTT] _handleQueue: sent / _headsize = 32 / 32
[AMQTT] _handleQueue: released packetType # CONNECT
[AMQTT] _onAck: ack len = 61
[AMQTT] _onData : data rcv len = 4
[AMQTT] _onData: rcv CONNACK
[AMQTT] CONNACK
[AMQTT] _onDisconnect: TCP disconn
Disconnected from MQTT.
Using Board = ESP32_DEV Module
Starting FullyFeatureSSL_WT32_ETH01 on WT32-ETH01 with ETH_PHY_LAN8720
WebServer_WT32_ETH01 v1.4.1 for core v2.0.0+
AsyncMQTT_Generic v1.4.0 for ESP32 core v2.0.0+
ETH starting
ETH connected
ETH got IP
IP address: 192.168.2.128
Connecting to MQTT...
[AMQTT] CONNECTING
[AMQTT] ClientID = esp32-fc4b08fa48a8
[AMQTT] _onAck: ack len = 250
[AMQTT] _onAck: ack len = 143
[AMQTT] _onAck: ack len = 51
[AMQTT] TCP conn, MQTT CONNECT
[AMQTT] _addFront: new front, packetType = CONNECT
[AMQTT] _handleQueue: snd, packetType # CONNECT , tls: realSent = 61
[AMQTT] _handleQueue: sent / _headsize = 32 / 32
[AMQTT] _handleQueue: released packetType # CONNECT
[AMQTT] _onAck: ack len = 61
[AMQTT] _onData : data rcv len = 4
[AMQTT] _onData: rcv CONNACK
[AMQTT] CONNACK
[AMQTT] _onDisconnect: TCP disconn
Disconnected from MQTT.

4. Things to try

Try using another board, or add a delay(1) or yield() in loop() to see if that helps. It's possible that your network is too slow, or that some issue is blocking / delaying data, etc.

void loop()
{
  yield();
  //delay(1);
}

I'm closing the issue now because this is not a proven bug of the library.

Good Luck,

khoih-prog commented 2 years ago

I suggest you move up to the versions I'm using:

  1. ESP32 core v2.0.2
  2. WebServer_WT32_ETH01 v1.4.1

You're using the very old core v1.0.6-

cagabit commented 2 years ago

Thank you both for the suggestions.

I tried delaying "loop" without any difference; the watchdog still triggers.

After upgrading to ESP32 core 2.0.2 and playing with the platformio.ini settings (especially platform=...), it started to connect as in your example post with the WT32_ETH01 board, but not every time it tries.

One question still bothers me; please just point me in the right direction, because I am not sure which library is causing this. My network is certainly slow, and the watchdog still triggers sometimes, I guess because of the slowness: roughly 1/3 of attempts succeed. In the other 2/3, or whenever you deliberately replace the fingerprint with a wrong one, the watchdog triggers. Is this caused by the Ethernet library, by the MQTT library that your library uses, or by something else? Ideally it should report that the connection was not established, so that you can retry or take other action.

Can you please test your setup with a wrong fingerprint? Maybe we can reproduce the error.

Thanks
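The "retry" part of this request is application logic that can be sketched independently of the watchdog question. Below is a minimal capped exponential backoff, assuming the sketch schedules its own reconnect after a failed attempt; reconnectDelayMs and its parameters are hypothetical names, not part of AsyncMQTT_Generic, and this does nothing to stop the watchdog itself from firing.

```cpp
#include <cstdint>

// Hypothetical helper (not a library function): returns the delay to wait
// before reconnect attempt number `attempt` (0-based). The delay starts at
// baseMs, doubles on every further attempt and is capped at maxMs.
std::uint32_t reconnectDelayMs(std::uint32_t attempt,
                               std::uint32_t baseMs,
                               std::uint32_t maxMs)
{
  std::uint32_t delayMs = baseMs;

  for (std::uint32_t i = 0; i < attempt && delayMs < maxMs; ++i)
  {
    // Jump straight to the cap when doubling would exceed it; this also
    // avoids overflowing the 32-bit value.
    delayMs = (delayMs > maxMs / 2) ? maxMs : delayMs * 2;
  }

  return (delayMs < maxMs) ? delayMs : maxMs;
}
```

With baseMs = 2000 and maxMs = 60000, the waits grow as 2 s, 4 s, 8 s and so on, then stay at 60 s, which avoids hammering a slow link with back-to-back TLS handshakes.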

khoih-prog commented 2 years ago

After upgrading to ESP32 core 2.0.2 and playing with the platformio.ini settings (especially platform=...), it started to connect as in your example post with the WT32_ETH01 board, but not every time it tries

sometimes still triggering watchdog

Given "not every time" and "sometimes", it's possible that your hardware (WT32_ETH01 board) has some erratic issue => test with other, new WT32_ETH01 boards

I certainly don't have time to deal with your issue unless you prove there is a bug in the library.

khoih-prog commented 2 years ago

Also try to use the latest ESP32 core v2.0.3-RC1 to see if it's better.

mehmetcanbalci commented 2 years ago

Hi, thank you for your wonderful library. I have the same issue; my connection is slow and delayed.

Connecting to MQTT...
.[ATCP] _connected: error => closing
Disconnected from MQTT.
...........E (111301) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (111301) task_wdt: - async_tcp_ssl (CPU 0/1)
E (111301) task_wdt: Tasks currently running:
E (111301) task_wdt: CPU 0: IDLE
E (111301) task_wdt: CPU 1: loopTask
E (111301) task_wdt: Aborting.

Could you please tell me how I can stop the WDT for async_tcp_ssl? I want to handle the timeout myself, e.g. by restarting only the MQTT part, because right now it resets the whole system. Thanks

khoih-prog commented 2 years ago

Hi @mehmetcanbalci

You can try adding defines in your code before the #include <AsyncMqtt_Generic.h> line, such as

#define CONFIG_ASYNC_TCP_RUNNING_CORE     -1     //any available core
#define CONFIG_ASYNC_TCP_USE_WDT           0     //if enabled, adds between 33us and 200us per event

#include <AsyncMqtt_Generic.h>

But normally the task watchdog is there for a reason: to prevent unexpected system hangs or excessive delays.

It's better to check and verify your code, network, server, MQTT broker, etc.

Good Luck,
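For the "deal with the timeout myself" part of the question, a minimal sketch of an application-level connect deadline, assuming the sketch calls start() when it begins a connection attempt, stop() once connected, and polls expired() from loop() with the current millis() value. ConnectTimeout is a hypothetical class, not part of AsyncMQTT_Generic, and it does not feed or disable the task watchdog; it only tells the application when an attempt has taken too long.

```cpp
#include <cstdint>

// Hypothetical helper (not a library class): tracks whether a connection
// attempt has been pending for longer than timeoutMs. Timestamps are
// millis()-style 32-bit counters; the unsigned subtraction keeps the
// comparison correct across counter wrap-around.
class ConnectTimeout
{
  public:
    explicit ConnectTimeout(std::uint32_t timeoutMs)
      : _timeoutMs(timeoutMs), _startMs(0), _active(false) {}

    // Call when a connection attempt begins.
    void start(std::uint32_t nowMs)
    {
      _startMs = nowMs;
      _active  = true;
    }

    // Call once connected, or after the attempt has been abandoned.
    void stop()
    {
      _active = false;
    }

    // True only while an attempt is pending and has exceeded the deadline.
    bool expired(std::uint32_t nowMs) const
    {
      return _active && (nowMs - _startMs) >= _timeoutMs;
    }

  private:
    std::uint32_t _timeoutMs;
    std::uint32_t _startMs;
    bool          _active;
};
```

When expired() returns true, the application can abandon the attempt and schedule a new one instead of waiting indefinitely.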