
Issue Tracker for ESPHome
https://esphome.io/

CAN Bus triggers watchdog reboot when nothing responds - no ACK #3317

Open Uksa007 opened 2 years ago

Uksa007 commented 2 years ago

The problem

Background info: there are two devices in total on the CAN bus, an ESP32 with a TJA1050 transceiver and an inverter. If the inverter is powered off, disconnected, or not acknowledging messages, no other device on the bus can acknowledge the CAN messages.

The CAN bus component triggers a watchdog timer (WDT) reboot loop, which ends in safe mode. This occurs when nothing ACKs the CAN bus messages, e.g. if the receiver is busy or goes offline. I think the CAN bus controller keeps retrying the unacknowledged messages, and I suspect that as more messages are sent, sending takes too long and triggers the watchdog timer.

Is there an option to detect this and set the controller so that it does not require an ACK, instead of rebooting on the WDT? Or an option to simply not trigger a WDT reboot?

Further info on No Ack mode, from the ESP-IDF TWAI documentation:

No Ack Mode: The No Acknowledgement mode is similar to normal mode, however acknowledgements are not required for a message transmission to be considered successful. This mode is useful when self testing the TWAI controller (loopback of transmissions).

enumerator TWAI_MODE_NO_ACK (https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/peripherals/twai.html#_CPPv4N11twai_mode_t16TWAI_MODE_NO_ACKE)
Transmission does not require acknowledgment. Use this mode for self testing
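
To make the suspicion above concrete, here is a minimal sketch of a bounded transmit queue in Python. It is a toy model only, not the ESP-IDF TWAI driver and not ESPHome's code: the class `CanTxModel`, the `run` helper, the queue depth of 5 and the 10-tick wait are all assumptions made for illustration. It shows that when an ACK is required and no peer is online, the head frame never leaves the queue, so once the queue is full every further send can only wait. With no ACK required (the behaviour the documentation describes for TWAI_MODE_NO_ACK), frames drain and the sends succeed.

```python
from collections import deque

QUEUE_LEN = 5  # assumed TX queue depth; hypothetical value for this model


class CanTxModel:
    """Toy model of a CAN transmit queue. Not the ESP-IDF TWAI driver."""

    def __init__(self, require_ack, peer_online, queue_len=QUEUE_LEN):
        self.require_ack = require_ack
        self.peer_online = peer_online
        self.queue_len = queue_len
        self.queue = deque()
        self.delivered = []

    def tick(self):
        """One transmit attempt: the head frame leaves the queue only when
        it is acknowledged, or when no acknowledgement is required."""
        if self.queue and (self.peer_online or not self.require_ack):
            self.delivered.append(self.queue.popleft())

    def send(self, can_id, timeout_ticks):
        """Queue a frame, waiting at most timeout_ticks for free space.
        Returns True if queued, False if the wait timed out."""
        waited = 0
        while len(self.queue) >= self.queue_len:
            if waited >= timeout_ticks:
                return False
            self.tick()
            waited += 1
        self.queue.append(can_id)
        return True


def run(require_ack, peer_online, rounds=3):
    """Send the five frames from the example YAML for several rounds."""
    bus = CanTxModel(require_ack, peer_online)
    ids = [0x359, 0x351, 0x355, 0x356, 0x354]
    results = []
    for _ in range(rounds):
        for can_id in ids:
            results.append(bus.send(can_id, timeout_ticks=10))
    return results, len(bus.delivered)


if __name__ == "__main__":
    print(run(require_ack=True, peer_online=False))   # queue fills, then sends time out
    print(run(require_ack=False, peer_online=False))  # frames drain without an ACK
```

In this model a timed-out send returns False and the caller carries on. If the wait were unbounded instead, the calling loop would stall at the first send after the queue fills, which is the kind of stall a task watchdog is meant to catch. Whether that is what actually happens in the component is an open question in this issue.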

Which version of ESPHome has the issue?

2022.5.1

What type of installation are you using?

pip

Which version of Home Assistant has the issue?

2022.4.7

What platform are you using?

ESP32-IDF

Board

esp32doit-devkit-v1

Component causing the issue

can-bus

Example YAML snippet

# Example to demonstrate CAN bus WDT trigger and reboot loop 
esphome:
  name: test-esp32-send-can

esp32:
  board: esp32doit-devkit-v1
  framework:
    type: esp-idf
    version: latest

# Enable logging
logger:

# Enable Home Assistant API
api:

ota:

wifi:
  ssid: !secret wifi_ssid
  password: !secret wifi_password

canbus:
  - platform: esp32_can
    tx_pin: GPIO23
    rx_pin: GPIO22
    can_id: 4
    bit_rate: 500kbps

interval:
  - interval: 900ms
    then:
      - canbus.send:
          can_id: 0x359
          data: [0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00]
      - canbus.send:
          can_id: 0x351
          data: [0x28, 0x02, 0x86, 0x03, 0xE8, 0x03, 0x00, 0x00]
      - canbus.send:
          can_id: 0x355
          data: [0x1E, 0x00, 0x64, 0x00, 0x00, 0x00, 0x00, 0x00]
      - canbus.send:
          can_id: 0x356
          data: [0xEF, 0x14, 0x00, 0x00, 0x22, 0x01, 0x00, 0x00]
      - canbus.send:
          can_id: 0x354
          data: [0x04, 0xC0, 0x00, 0x1F, 0x03, 0x00, 0x00, 0x00]

Anything in the logs that might be useful for us?

[10:51:30][D][canbus:033]: send extended id=0x351 rtr=FALSE size=8
[10:51:31][D][canbus:033]: send extended id=0x355 rtr=FALSE size=8
[10:51:32][D][canbus:033]: send extended id=0x356 rtr=FALSE size=8
[10:51:33][D][canbus:033]: send extended id=0x354 rtr=FALSE size=8
[10:51:34]E (10809) task_wdt: Task watchdog got triggered. The following tasks did not reset the watchdog in time:
[10:51:34]E (10809) task_wdt:  - loopTask (CPU 1)
[10:51:34]E (10809) task_wdt: Tasks currently running:
[10:51:34]E (10809) task_wdt: CPU 0: IDLE
[10:51:34]E (10809) task_wdt: CPU 1: IDLE
[10:51:34]E (10809) task_wdt: Aborting.
[10:51:34]
[10:51:34]abort() was called at PC 0x400e4b4c on core 0
[10:51:34]
[10:51:34]Backtrace:0x40081bb6:0x3ffb0800 0x40088f5d:0x3ffb0820 0x4008f006:0x3ffb0840 0x400e4b4c:0x3ffb08b0 0x4008287d:0x3ffb08d0 0x40148b53:0x3ffbbad0 0x400e4dea:0x3ffbbaf0 0x4008a300:0x3ffbbb10
WARNING Found stack trace! Trying to decode it
WARNING Decoded 0x40081bb6: panic_abort at C:\Users\Paul\.platformio\packages\framework-espidf\components\esp_system/panic.c:404
WARNING Decoded 0x40088f5d: esp_system_abort at C:\Users\Paul\.platformio\packages\framework-espidf\components\esp_system/system_api.c:112
WARNING Decoded 0x4008f006: abort at C:\Users\Paul\.platformio\packages\framework-espidf\components\newlib/abort.c:46
WARNING Decoded 0x400e4b4c: task_wdt_isr at C:\Users\Paul\.platformio\packages\framework-espidf\components\esp_common\src/task_wdt.c:182 (discriminator 1)
WARNING Decoded 0x4008287d: _xt_lowint1 at C:\Users\Paul\.platformio\packages\framework-espidf\components\freertos\port\xtensa/xtensa_vectors.S:1105
WARNING Decoded 0x40148b53: cpu_ll_waiti at C:\Users\Paul\.platformio\packages\framework-espidf\components\hal\esp32\include/hal/cpu_ll.h:183
 (inlined by) esp_pm_impl_waiti at C:\Users\Paul\.platformio\packages\framework-espidf\components\esp_pm/pm_impl.c:827
WARNING Decoded 0x400e4dea: esp_vApplicationIdleHook at C:\Users\Paul\.platformio\packages\framework-espidf\components\esp_common\src/freertos_hooks.c:63
WARNING Decoded 0x4008a300: prvIdleTask at C:\Users\Paul\.platformio\packages\framework-espidf\components\freertos/tasks.c:3846
[10:51:34]
[10:51:34]
[10:51:34]ELF file SHA256: a32cf67929d8eb53
[10:51:34]
[10:51:34]Rebooting...

Additional information

No response

Uksa007 commented 2 years ago

Does anyone have any thoughts or suggestions? I have set a delay of 12 ms between each canbus.send, but that didn't seem to help. Is there any way to clear the CAN bus TX messages that haven't been acknowledged?

ircoopr commented 2 years ago

Hello Uksa007 - wish I could help, but I'm actually here after finding this same issue while running your fork of the esphome-jk-bms project! Thanks so much for all your work on that. I thought I was going to have to do this myself before I found your work!

I'll let you know if I do find a solution. I don't love the idea of the inverter going offline and losing comms with my BMS system as a result!

autox86 commented 1 year ago

@oxan @CarlosGS I have seen that you have been involved in other watchdog-related queries: https://github.com/esphome/esphome/pull/2846 https://github.com/esphome/esphome/pull/2535

Would you be able to help get this fixed as well? We would love to have CAN bus running for battery management: https://github.com/Uksa007/esphome-jk-bms-can

@nielsnl68 Is this issue related to your fix? https://github.com/esphome/issues/issues/3422#issuecomment-1247300527

Uksa007 commented 10 months ago

@oxan @CarlosGS @autox86 @nielsnl68

The ili9341 is a display on the SPI bus, whereas this is CAN bus using the native ESP32 TWAI controller, so it doesn't seem related.

Any progress? How do we move this forward?