esphome / issues

Issue Tracker for ESPHome
https://esphome.io/
290 stars 36 forks source link

ESP32-IDF hang (watchdog triggered) on shutdown with MQTT client config #3922

Open gwilford opened 1 year ago

gwilford commented 1 year ago

The problem

Using esp-idf framework on ESP32 to support MQTT TLS connection. ESP32 hangs on shutdown after OTA update due to WDT timeout when MQTT client is included in running config. Requires power cycle to recover device:

E (83800) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time: E (83800) task_wdt: - loopTask (CPU 1) E (83800) task_wdt: Tasks currently running: E (83800) task_wdt: CPU 0: IDLE E (83800) task_wdt: CPU 1: IDLE E (83800) task_wdt: Print CPU 0 (current core) backtrace

Same result with two different ESP32 board configurations.

Which version of ESPHome has the issue?

2022.12.0

What type of installation are you using?

Home Assistant Add-on

Which version of Home Assistant has the issue?

2022.12.5

What platform are you using?

ESP32-IDF

Board

generic 30pin devkit v1

Component causing the issue

mqtt

Example YAML snippet

esphome:
  name: esp32-scratch2

esp32:
  board: esp32dev
  framework:
    type: esp-idf

# Enable logging
logger:
  level: VERBOSE

ota:
  password: "[redacted]"

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

mqtt:
  broker: [redacted]
  port: 8883
  username: !secret mqtt_user
  password: !secret mqtt_password
#  shutdown_message:
  # Evaluate carefully skip_cert_cn_check
  skip_cert_cn_check: false
  idf_send_async: false
  certificate_authority: |
    -----BEGIN CERTIFICATE-----
    [redacted]
    -----END CERTIFICATE-----

Anything in the logs that might be useful for us?

[C][wifi:369]:   Signal strength: -53 dB ▂▄▆█
[V][wifi:371]:   Priority: 0.0
[C][wifi:373]:   Channel: 6
[C][wifi:374]:   Subnet: 255.255.255.0
[C][wifi:375]:   Gateway: 192.168.1.1
[C][wifi:376]:   DNS1: 192.168.1.22
[C][wifi:377]:   DNS2: 192.168.1.22
[C][ota:093]: Over-The-Air Updates:
[C][ota:094]:   Address: esp32-scratch2.local:3232
[C][ota:097]:   Using Password.
[C][mqtt:029]: Setting up MQTT...
[D][mqtt:117]: Resolving MQTT broker IP address...
[D][mqtt:150]: Resolved broker IP address to [redacted]
[I][mqtt:176]: Connecting to MQTT...
[V][mqtt.idf:073]: Event dispatched from event loop event_id=7
[V][mqtt.idf:076]: MQTT_EVENT_BEFORE_CONNECT
[V][mqtt.idf:073]: Event dispatched from event loop event_id=1
[V][mqtt.idf:080]: MQTT_EVENT_CONNECTED
[I][mqtt:216]: MQTT Connected!
[I][app:062]: setup() finished successfully!
[V][mqtt:415]: Publish(topic='esp32-scratch2/status' payload='online' retain=1)
[I][app:102]: ESPHome version 2022.12.0 compiled on Dec 15 2022, 09:43:00
[C][wifi:504]: WiFi:
[C][wifi:362]:   Local MAC: 78:21:84:7E:4C:50
[C][wifi:363]:   SSID: [redacted]
[C][wifi:364]:   IP Address: 192.168.1.54
[C][wifi:366]:   BSSID: [redacted]
[C][wifi:367]:   Hostname: 'esp32-scratch2'
[C][wifi:369]:   Signal strength: -50 dB ▂▄▆█
[V][wifi:371]:   Priority: 0.0
[C][wifi:373]:   Channel: 6
[C][wifi:374]:   Subnet: 255.255.255.0
[C][wifi:375]:   Gateway: 192.168.1.1
[C][wifi:376]:   DNS1: 192.168.1.22
[C][wifi:377]:   DNS2: 192.168.1.22
[C][logger:293]: Logger:
[C][logger:294]:   Level: VERBOSE
[C][logger:295]:   Log Baud Rate: 115200
[C][logger:296]:   Hardware UART: UART0
[C][mdns:103]: mDNS:
[C][mdns:104]:   Hostname: esp32-scratch2
[V][mdns:105]:   Services:
[V][mdns:107]:   - _http, _tcp, 80
[V][mdns:109]:     TXT: version = 2022.12.0
[C][ota:093]: Over-The-Air Updates:
[C][ota:094]:   Address: esp32-scratch2.local:3232
[C][ota:097]:   Using Password.
[C][mqtt:065]: MQTT:
[C][mqtt:067]:   Server Address: [redacted]:8883 ([redacted])
[C][mqtt:068]:   Username: 'mqtt'[redacted]
[C][mqtt:069]:   Client ID: 'esp32-scratch2-7821847e4c50'[redacted]
[C][mqtt:071]:   Discovery prefix: 'homeassistant'
[C][mqtt:072]:   Discovery retain: YES
[C][mqtt:074]:   Topic Prefix: 'esp32-scratch2'
[C][mqtt:076]:   Log Topic: 'esp32-scratch2/debug'
[C][mqtt:079]:   Availability: 'esp32-scratch2/status'
[V][wifi_esp32:651]: Event: Got IPv6 address=[redacted]
[V][esp-idf:000]: W (17059) wifi:
[V][esp-idf:000]: <ba-add>idx:1 (ifx:0, cc:2d:21:46:20:e1), tid:6, ssn:0, winSize:64
[V][esp-idf:000]: 

[D][ota:147]: Starting OTA Update from ::FFFF:192.168.1.22...
[V][ota:178]: OTA features is 0x01
[V][ota:198]: Auth: Nonce is fbc6938b3b0229f385524946e58a546b
[V][ota:218]: Auth: CNonce is 5a2c3c1d8c6b64642ef9f8843395cf72
[V][ota:225]: Auth: Result is 63624317003b8d47b68e7cdf75e10fc2
[V][ota:233]: Auth: Response is 63624317003b8d47b68e7cdf75e10fc2
[V][ota:261]: OTA size is 820704 bytes
[V][ota:278]: Update: Binary MD5 is b9900b66956243024ed0bee3ba565ad8
[D][ota:316]: OTA in progress: 0.1%
[D][ota:316]: OTA in progress: 19.2%
[D][ota:316]: OTA in progress: 39.5%
[D][ota:316]: OTA in progress: 50.5%
[D][ota:316]: OTA in progress: 73.1%
[V][esp-idf:000]: I (71486) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=1ebd4h (125908) map

[V][esp-idf:000]: I (71534) esp_image: segment 1: paddr=0002ebfc vaddr=3ffb0000 size=0141ch (  5148) 

[V][esp-idf:000]: I (71544) esp_image: segment 2: paddr=00030020 vaddr=400d0020 size=92394h (598932) map

[V][esp-idf:000]: I (71750) esp_image: segment 3: paddr=000c23bc vaddr=3ffb141c size=02218h (  8728) 

[V][esp-idf:000]: I (71761) esp_image: segment 4: paddr=000c45dc vaddr=40080000 size=13fc0h ( 81856) 

[V][esp-idf:000]: I (71795) esp_image: segment 5: paddr=000d85a4 vaddr=50000000 size=00010h (    16) 

[V][esp-idf:000]: I (71802) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=1ebd4h (125908) map

[V][esp-idf:000]: I (71855) esp_image: segment 1: paddr=0002ebfc vaddr=3ffb0000 size=0141ch (  5148) 

[V][esp-idf:000]: I (71863) esp_image: segment 2: paddr=00030020 vaddr=400d0020 size=92394h (598932) map

[V][esp-idf:000]: I (72069) esp_image: segment 3: paddr=000c23bc vaddr=3ffb141c size=02218h (  8728) 

[V][esp-idf:000]: I (72079) esp_image: segment 4: paddr=000c45dc vaddr=40080000 size=13fc0h ( 81856) 

[V][esp-idf:000]: I (72115) esp_image: segment 5: paddr=000d85a4 vaddr=50000000 size=00010h (    16) 

[I][ota:349]: OTA update finished!
[I][app:134]: Rebooting safely...
[D][esp32.preferences:113]: Saving 1 preferences to flash...
[V][esp32.preferences:125]: sync: key: 233825507, len: 4
[D][esp32.preferences:142]: Saving 1 preferences to flash: 0 cached, 1 written, 0 failed
[V][mqtt:415]: Publish(topic='esp32-scratch2/status' payload='offline' retain=1)
[V][esp-idf:000]: I (72347) MQTT_CLIENT: Client asked to disconnect

[V][esp-idf:000]: I (72351) wifi:
[V][esp-idf:000]: state: run -> init (0)
[V][esp-idf:000]: 

[V][esp-idf:000]: I (72359) wifi:
[V][esp-idf:000]: pm stop, total sleep time: 46888446 us / 65682525 us

[V][esp-idf:000]: 

[V][esp-idf:000]: W (72369) wifi:
[V][esp-idf:000]: <ba-del>idx
[V][esp-idf:000]: 

[V][esp-idf:000]: W (72379) wifi:
[V][esp-idf:000]: <ba-del>idx
[V][esp-idf:000]: 

[V][esp-idf:000]: I (72389) wifi:
[V][esp-idf:000]: new:<6,0>, old:<6,0>, ap:<255,255>, sta:<6,0>, prof:1
[V][esp-idf:000]: 

[V][esp-idf:000]: W (72403) wifi:
E (77197) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (77197) task_wdt:  - loopTask (CPU 1)
E (77197) task_wdt: Tasks currently running:
E (77197) task_wdt: CPU 0: IDLE
E (77197) task_wdt: CPU 1: IDLE
E (77197) task_wdt: Print CPU 0 (current core) backtrace

Backtrace:0x400E92E7:0x3FFB08A00x40082DE1:0x3FFB08C0 0x4015F507:0x3FFBB100 0x400E98F5:0x3FFBB120 0x40089E4C:0x3FFBB140 

E (77197) task_wdt: Print CPU 1 backtrace

Backtrace:0x4008237D:0x3FFB0EA00x40082DE1:0x3FFB0EC0 0x4015F507:0x3FFBB860 0x400E98F5:0x3FFBB880 0x40089E4C:0x3FFBB8A0 

E (82197) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
E (82197) task_wdt:  - loopTask (CPU 1)
E (82197) task_wdt: Tasks currently running:
E (82197) task_wdt: CPU 0: IDLE
E (82197) task_wdt: CPU 1: IDLE
E (82197) task_wdt: Print CPU 0 (current core) backtrace

Backtrace:0x400E92E7:0x3FFB08A00x40082DE1:0x3FFB08C0 0x4015F507:0x3FFBB100 0x400E98F5:0x3FFBB120 0x40089E4C:0x3FFBB140

Additional information

No response

gwilford commented 1 year ago

Further testing gives same result with non-TLS MQTT config:

mqtt:
   broker: !secret mqtt_host
   port: 1883
   username: !secret mqtt_user
   password: !secret mqtt_password

Forcing an esp_restart() in esphome -> on_shutdown doesn't help as the hang appears to occur within esp_restart().

After some research, I wonder if this may be related to a wifi shutdown_handler hanging in the esp-idf framework. In turn, it could be waiting on some MQTT connections/threads which remain at shutdown.

PaleoW commented 1 year ago

Hi,

I can confirme the problem above. With mqtt: enabled in my config, every OTA update requires a power cycle.

ESP32-IDF . tested on several m5stack-atom.

github-actions[bot] commented 1 year ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.