espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

Assert in TWAI interrupt handler twai_handle_tx_buffer_frame() (IDFGH-7616) #9169

Open zevv opened 2 years ago

zevv commented 2 years ago

Environment

Chip: ESP32-D0WD
IDF version: 4.3
Board: custom board

Problem Description

Our application occasionally crashes because of a failed assertion in twai_handle_tx_buffer_frame().

Expected Behavior

Program does not crash on assertions in interrupt handlers :)

Actual Behavior

During normal program operation the application is aborted on an assertion, with the stack trace shown under Debug Logs.

Steps to reproduce

We are not sure yet what triggers this situation, as it is hard to reproduce. If I find a way to reproduce I can try to provide a minimal application showing the behavior, but it is likely related to other traffic on the bus.

Debug Logs

0x40089484: panic_abort at /opt/toolchains/esp-idf/components/esp_system/panic.c:356
0x40089f5d: esp_system_abort at /opt/toolchains/esp-idf/components/esp_system/system_api.c:112
0x400904f9: abort at /opt/toolchains/esp-idf/components/newlib/abort.c:46
0x400832ef: lock_acquire_generic at /opt/toolchains/esp-idf/components/newlib/locks.c:138
0x400834fe: _lock_acquire_recursive at /opt/toolchains/esp-idf/components/newlib/locks.c:166
0x4010e99d: _vfiprintf_r at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/vfprintf.c:853 (discriminator 2)
0x401069e9: fiprintf at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdio/fiprintf.c:48
0x401068cd: __assert_func at /builds/idf/crosstool-NG/.build/xtensa-esp32-elf/src/newlib/newlib/libc/stdlib/assert.c:58 (discriminator 8)
0x40101351: twai_handle_tx_buffer_frame at /opt/toolchains/esp-idf/components/driver/twai.c:189
 (inlined by) twai_intr_handler_main at /opt/toolchains/esp-idf/components/driver/twai.c:232
0x400830e5: _xt_lowint1 at /opt/toolchains/esp-idf/components/freertos/port/xtensa/xtensa_vectors.S:1105
0x400f67ca: cpu_ll_waiti at /opt/toolchains/esp-idf/components/hal/esp32/include/hal/cpu_ll.h:183
 (inlined by) esp_pm_impl_waiti at /opt/toolchains/esp-idf/components/esp_pm/pm_impl.c:827
0x400d9df6: esp_vApplicationIdleHook at /opt/toolchains/esp-idf/components/esp_common/src/freertos_hooks.c:63
0x4008b803: prvIdleTask at /opt/toolchains/esp-idf/components/freertos/tasks.c:3839 (discriminator 1)
0x4008d6b2: vPortTaskWrapper at /opt/toolchains/esp-idf/components/freertos/port/xtensa/port.c:168
zevv commented 2 years ago

sdkconfig.txt

boborjan2 commented 2 years ago

This happens in our setup as well: the assert at twai.c line 189 fires quite often during bus-off. We are on IDF 4.3, almost the latest (dd62e3bb9bee4fed7f1a20922d6c0378fc45eb27). Our theory is that it happens when the user calls twai_initiate_recovery() after a bus-off event while frames are still waiting in the TX queue. twai_initiate_recovery() zeroes tx_msg_count, which trips the assert in the interrupt handler that checks this same message count. One way to solve it is to set the object state in twai_initiate_recovery() BEFORE zeroing the message count:

esp_err_t twai_initiate_recovery(void)
{
    TWAI_ENTER_CRITICAL();
    //Check state
    TWAI_CHECK_FROM_CRIT(p_twai_obj != NULL, ESP_ERR_INVALID_STATE);
    TWAI_CHECK_FROM_CRIT(p_twai_obj->state == TWAI_STATE_BUS_OFF, ESP_ERR_INVALID_STATE);

    //Set state first so the interrupt handler can detect that recovery has started
    p_twai_obj->state = TWAI_STATE_RECOVERING;

    //Reset TX Queue/Counters
    if (p_twai_obj->tx_queue != NULL) {
        xQueueReset(p_twai_obj->tx_queue);
    }
    p_twai_obj->tx_msg_count = 0;

    //Trigger start of recovery process
    twai_hal_start_bus_recovery(&twai_context);
    TWAI_EXIT_CRITICAL();

    return ESP_OK;
}

Then check the object state in the interrupt handler, in twai_handle_tx_buffer_frame(), and simply return if the driver is recovering.
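To illustrate the guard described above, here is a minimal host-side sketch. It is not the esp-idf driver code: the `sim_*` types and functions are hypothetical stand-ins for the driver's internal object and handlers, and the form of the sanity check (count must not go negative after the decrement) is an assumption about what the assert at twai.c line 189 verifies.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for the driver's internal state (not the real esp-idf types). */
typedef enum {
    SIM_STATE_RUNNING,
    SIM_STATE_BUS_OFF,
    SIM_STATE_RECOVERING,
} sim_state_t;

typedef struct {
    sim_state_t state;
    int tx_msg_count;   /* frames queued or in flight */
} sim_twai_obj_t;

/* Mirrors the proposed ordering: set the state first, then reset the counter. */
static bool sim_initiate_recovery(sim_twai_obj_t *obj)
{
    if (obj->state != SIM_STATE_BUS_OFF) {
        return false;               /* recovery is only valid from bus-off */
    }
    obj->state = SIM_STATE_RECOVERING;
    obj->tx_msg_count = 0;
    return true;
}

/* Guarded TX-done handler: returns false (event ignored) while recovering. */
static bool sim_handle_tx_buffer_frame(sim_twai_obj_t *obj)
{
    if (obj->state == SIM_STATE_RECOVERING) {
        return false;               /* counters were reset; skip the bookkeeping */
    }
    obj->tx_msg_count--;
    assert(obj->tx_msg_count >= 0); /* assumed form of the driver's sanity check */
    return true;
}
```

With this guard, a late TX interrupt that arrives after `sim_initiate_recovery()` has zeroed the counter is ignored instead of driving the count negative. In the real driver both functions would have to run under the same critical section for the check to be reliable.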

Please confirm.

Thanks, Viktor