esphome / issues

Issue Tracker for ESPHome
https://esphome.io/

'idle' Task watchdog triggered when ARDUINO_RUNNING_CORE and CONFIG_ARDUINO_RUNNING_CORE do not match #5317

Open nickolay opened 11 months ago

nickolay commented 11 months ago

The problem

Consider:

  1. In arduino-esp32 loopTask is pinned to ARDUINO_RUNNING_CORE
  2. esphome disables the 'idle' task watchdog based on CONFIG_ARDUINO_RUNNING_CORE, which is pre-set to 1 on ESP32-S3.
  3. ARDUINO_RUNNING_CORE is "configurable" independently from CONFIG_ARDUINO_RUNNING_CORE

In my case the lilygo-t5-47-plus board manifest sets ARDUINO_RUNNING_CORE=0, causing a mismatch: arduino/esphome runs on core 0, while esphome disables the "IDLE task" watchdog on core 1.

This causes random crashes. In my project the crash happened shortly after the first startup (when connecting to Wi-Fi), during yield() in Application::setup(). Depending on the specific components used, the crash may disappear, or it may render the board unusable by crashing on startup.

Suggestion: change esphome to use ARDUINO_RUNNING_CORE instead.
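
To make the mismatch concrete, here is a minimal standalone sketch. It is a model, not esphome's actual code: the constants stand in for the two macros with the values from this report, and idle_wdt_can_trigger is a hypothetical helper that encodes the assumption that the idle task on the loopTask core is the one at risk of being starved.

```cpp
// Stand-ins for the two macros, using the values from this report.
// In a real build they come from sdkconfig and the board build flags.
constexpr int kConfigArduinoRunningCore = 1;  // CONFIG_ARDUINO_RUNNING_CORE
constexpr int kArduinoRunningCore = 0;        // ARDUINO_RUNNING_CORE (board override)

// Hypothetical model: the idle task sharing a core with loopTask may be
// starved, so its watchdog can fire unless that core's idle watchdog is
// the one that was disabled.
constexpr bool idle_wdt_can_trigger(int loop_task_core, int wdt_disabled_core) {
  return loop_task_core != wdt_disabled_core;
}

// Behaviour described in the report: pick the core from the CONFIG_ value.
constexpr bool kCurrentChoiceAtRisk =
    idle_wdt_can_trigger(kArduinoRunningCore, kConfigArduinoRunningCore);

// Suggested behaviour: pick the core loopTask is actually pinned to.
constexpr bool kSuggestedChoiceAtRisk =
    idle_wdt_can_trigger(kArduinoRunningCore, kArduinoRunningCore);

static_assert(kCurrentChoiceAtRisk, "mismatched defines leave core 0 idle watched");
static_assert(!kSuggestedChoiceAtRisk, "matching the loopTask core avoids the mismatch");
```

With the values above, the current choice leaves the core 0 idle watchdog active while loopTask runs there, which is consistent with the "IDLE (CPU 0)" line in the log below.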

Which version of ESPHome has the issue?

2023.12.5

What type of installation are you using?

pip

Which version of Home Assistant has the issue?

-

What platform are you using?

ESP32

Board

lilygo-t5-47-plus

Component causing the issue

esp32

Example YAML snippet

esphome:
  name: lilygo
  platformio_options:  # these (except for the RUNNING_CORE=0 defines)
                      # are needed because the specific board I have is
                      # not bundled with platformio
    upload_speed: 921600
    monitor_speed: 115200
    board_build.arduino.memory_type: qio_opi
    board_build.flash_size: 16MB
    build_flags:
      - "-DBOARD_HAS_PSRAM"
      - "-DARDUINO_RUNNING_CORE=0"  # NOTE: this conflicts with the value from the base board, producing a lot of warnings during the build
      - "-DARDUINO_EVENT_RUNNING_CORE=0"  # and this too

esp32:
  variant: esp32s3
  board: esp32-s3-devkitc-1

  framework:
    type: arduino

logger:
  level: VERBOSE
  # hardware_uart: USB_CDC  # default on S3 since esphome 2023.12

# Enable Home Assistant API
api:
  password: !secret api_ota_password
ota:
  password: !secret api_ota_password
wifi: !include wifi-secrets.yaml

Anything in the logs that might be useful for us?

[03:38:23][ 10870][W][WiFiGeneric.cpp:955] _eventCallback(): Reason: 4 - ASSOC_EXPIRE
[03:38:23][W][wifi_esp32:458]: Event: Disconnected ssid='XXXXXX' bssid=XX:XX:XX:XX:XX:XX reason='Association Expired'
[03:38:27]E (15536) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
[03:38:27]E (15536) task_wdt:  - IDLE (CPU 0)
[03:38:27]E (15536) task_wdt: Tasks currently running:
[03:38:27]E (15536) task_wdt: CPU 0: loopTask
[03:38:27]E (15536) task_wdt: CPU 1: IDLE
[03:38:27]E (15536) task_wdt: Aborting.
----------- (restarted) -----------
[03:38:27]ESP-ROM:esp32s3-20210327

Additional information

core.dump.parsed.txt

ssieb commented 11 months ago

You've defined board: esp32-s3-devkitc-1, so why does it matter what is in the lilygo board definition that you're not using? Why do you need to override the working config with something that doesn't work? You don't need any of those build_flags:.

nickolay commented 11 months ago

Thanks for your prompt reply.

You've defined board: esp32-s3-devkitc-1 ... Why do you need to override the working config with something that doesn't work?

I have provided a self-contained example to demonstrate the underlying problem of esphome using the wrong #define.

You can instead download and use the board definition, which itself includes this ARDUINO_RUNNING_CORE=0 define. Other users are doing so; it just takes more steps until the board definition is included in platformio (https://github.com/platformio/platform-espressif32/issues/1269).

I have also asked the board vendor about the reason for this define (https://github.com/Xinyuan-LilyGO/LilyGo-EPD47/issues/107), and I am indeed going to try to work around the problem by omitting it.

Still I don't see the reason for esphome to test CONFIG_ARDUINO_RUNNING_CORE instead of ARDUINO_RUNNING_CORE. Do you?

ssieb commented 11 months ago

I don't know the reason. Probably whoever wrote that didn't know the difference either. Why are there two different defines? Anyway, that really seems like a mistake in the board definition. It seems to me that it would cause huge performance issues. You could make an esphome PR to change it if you want and hopefully there's someone that knows more about that who can review it.

nickolay commented 11 months ago

Why are there two different defines?

The commit adding ARDUINO_RUNNING_CORE doesn't explain it very well, but I think the CONFIG_ one is managed via Kconfig, which is out of reach for basic arduino users.
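
One plausible way the two defines relate is a fallback pattern, sketched below. This is an assumption about the general shape, not a quote from arduino-esp32: the Kconfig value acts as the default, and a build flag such as -DARDUINO_RUNNING_CORE=0 wins when present. The build flag is simulated here with a plain #define, and kLoopTaskCore and kKconfigCore are illustrative names.

```cpp
// Kconfig-managed default (stand-in; normally generated into sdkconfig.h).
#ifndef CONFIG_ARDUINO_RUNNING_CORE
#define CONFIG_ARDUINO_RUNNING_CORE 1
#endif

// Simulates a board manifest passing -DARDUINO_RUNNING_CORE=0 to the compiler.
#define ARDUINO_RUNNING_CORE 0

// Assumed fallback: use the Kconfig value only when nothing overrode the macro.
#ifndef ARDUINO_RUNNING_CORE
#define ARDUINO_RUNNING_CORE CONFIG_ARDUINO_RUNNING_CORE
#endif

// After preprocessing, the two values can legitimately differ.
constexpr int kLoopTaskCore = ARDUINO_RUNNING_CORE;
constexpr int kKconfigCore = CONFIG_ARDUINO_RUNNING_CORE;
```

Under this pattern, code that tests CONFIG_ARDUINO_RUNNING_CORE sees 1 while the loop task is pinned according to ARDUINO_RUNNING_CORE, which is 0.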

Anyway, that really seems like a mistake in the board definition.

It could be, but we might as well not punish the user of such a board with a wdt crash :)

You could make an esphome PR to change it

Will do, unless someone beats me to it.

Thanks again!

github-actions[bot] commented 7 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.