espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

Guru Meditation Error: Core 0 panic'ed (Interrupt wdt timeout on CPU0) with modbus and autosleep (IDFGH-13961) #14792

Open juantxorena opened 1 month ago

juantxorena commented 1 month ago

Answers checklist.

IDF version.

v5.3.1

Espressif SoC revision.

esp32-c6

Operating System used.

Linux

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

None

Development Kit.

esp32-c6-devkitm-1

Power Supply used.

USB

What is the expected behavior?

When using modbus with uart and automatic sleep enabled, it should work without problems.

What is the actual behavior?

It breaks occasionally with the below errors and restarts.

Steps to reproduce.

This code configures automatic light sleep:

#if CONFIG_PM_ENABLE
    // Configure dynamic frequency scaling:
    // maximum and minimum frequencies are set in sdkconfig,
    // automatic light sleep is enabled if tickless idle support is enabled.
    esp_pm_config_t pm_config = {
        .max_freq_mhz = CONFIG_EXAMPLE_MAX_CPU_FREQ_MHZ,
        .min_freq_mhz = CONFIG_EXAMPLE_MIN_CPU_FREQ_MHZ,
#if CONFIG_FREERTOS_USE_TICKLESS_IDLE
        .light_sleep_enable = true
#endif
    };
    ESP_ERROR_CHECK(esp_pm_configure(&pm_config));
#endif // CONFIG_PM_ENABLE

The following code is the initialization of modbus, using UART 1:

    mb_communication_info_t comm_info = {
        .port = UART_PORT_NUM,    // UART port (adjust to your configuration)
        .mode = MB_MODE_RTU,      // Modbus mode (RTU for RS232)
        .baudrate = UART_BAUD_RATE,         // Baudrate
        .parity = MB_PARITY_NONE  // No parity
    };
    void* master_handler = NULL;
    esp_err_t err = mbc_master_init(MB_PORT_SERIAL_MASTER, &master_handler);

    MB_RETURN_ON_FALSE((master_handler != NULL), ESP_ERR_INVALID_STATE, TAG,
                                "mb controller initialization fail.");
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
                            "mb controller initialization fail, returns(0x%x).", (int)err);
    err = mbc_master_setup((void*)&comm_info);
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
                            "mb controller setup fail, returns(0x%x).", (int)err);

    err = uart_set_pin(UART_PORT_NUM, UART_TX_PIN, UART_RX_PIN,
                              UART_PIN_NO_CHANGE, UART_PIN_NO_CHANGE);
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
        "mb serial set pin failure, uart_set_pin() returned (0x%x).", (int)err);
    err = mbc_master_start();
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
                            "mb controller start fail, returned (0x%x).", (int)err);

    err = uart_set_mode(UART_PORT_NUM, UART_MODE_UART);
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
            "mb serial set mode failure, uart_set_mode() returned (0x%x).", (int)err);

    vTaskDelay(5);
    MB_RETURN_ON_FALSE((err == ESP_OK), ESP_ERR_INVALID_STATE, TAG,
                                "mb controller set descriptor fail, returns(0x%x).", (int)err);
    ESP_LOGI(TAG, "Modbus master stack initialized...");
    esp_sleep_enable_uart_wakeup(UART_PORT_NUM);

    vTaskDelay(10);

I'm reading data like this:

    uint16_t data[17];  // Array to store 17 registers (2 bytes each)
    esp_err_t err;

    mb_param_request_t param_request = {
        .slave_addr = MB_DEVICE_ADDR,
        .command = 0x03,
        .reg_start = 0x000A,
        .reg_size = 17 
    };

    while ((err = mbc_master_send_request(&param_request, (uint8_t*)data)) != ESP_OK) {
        ESP_LOGE(TAG, "Error reading Modbus registers: %s", esp_err_to_name(err));
        vTaskDelay(pdMS_TO_TICKS(controller_backoff_delay));
        controller_backoff_delay *= 2;
        if (controller_backoff_delay > CONTROLLER_MAX_BACKOFF_DELAY) {
            controller_backoff_delay = CONTROLLER_MAX_BACKOFF_DELAY;
        }
    }
    controller_backoff_delay = CONTROLLER_MIN_BACKOFF_DELAY;

Modbus is the only thing in my application that uses a UART.

Debug Logs.

Guru Meditation Error: Core  0 panic'ed (Interrupt wdt timeout on CPU0). 

Core  0 register dump:
MEPC    : 0x40801e70  RA      : 0x40801c48  SP      : 0x4082b570  GP      : 0x4081e324  
--- 0x40801e70: uart_ll_update at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/hal/esp32c6/include/hal/uart_ll.h:102 (discriminator 1)
 (inlined by) uart_ll_force_xon at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/hal/esp32c6/include/hal/uart_ll.h:1319 (discriminator 1)
 (inlined by) resume_uarts at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:586 (discriminator 1)
 (inlined by) esp_sleep_start at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:1081 (discriminator 1)
0x40801c48: resume_uarts at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:585
 (inlined by) esp_sleep_start at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:1081

TP      : 0x4082b710  T0      : 0x40027bd4  T1      : 0x3ffff9b6  T2      : 0x3e080000  
--- 0x40027bd4: Cache_Resume_ICache in ROM

S0/FP   : 0x4081dc30  S1      : 0x00000001  A0      : 0xfffdffff  A1      : 0xfff7ffff  
A2      : 0x00000003  A3      : 0x40828000  A4      : 0x60001000  A5      : 0x00000001  
A6      : 0x4082cdec  A7      : 0x4082cdb4  S2      : 0x40828000  S3      : 0x00000000  
S4      : 0x00000010  S5      : 0x00000000  S6      : 0x00026227  S7      : 0x00000000  
S8      : 0x40828000  S9      : 0x40828000  S10     : 0x00000002  S11     : 0x4082afc4  
T3      : 0x00000000  T4      : 0x000008c0  T5      : 0x00000000  T6      : 0x00000000  
MSTATUS : 0x00001881  MTVEC   : 0x40800001  MCAUSE  : 0x00000018  MTVAL   : 0x00008b85  
--- 0x40800001: _vector_table at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/riscv/vectors_intc.S:54

MHARTID : 0x00000000  

Stack memory:
4082b570: 0x00000001 0xa5a5a5a5 0x00000000 0x4082b598 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b590: 0xa5a5a5a5 0xa5a5a5a5 0x00000000 0x00000028 0x00000004 0x0000000a 0x20000000 0x0c000000
4082b5b0: 0x00000000 0x00000000 0x40000000 0x00000000 0xc0000000 0x00000000 0x00000000 0x08000000
--- 0x40000000: _start in ROM

4082b5d0: 0xc0000000 0x08040000 0x00000000 0x00000000 0x00000000 0x00000000 0xc0000000 0x60400000
4082b5f0: 0x00000000 0x00004375 0x0000097a 0x001f001f 0x0000000f 0x00000046 0x1846001f 0x0f631f00
4082b610: 0x4080cee8 0x00000028 0x00000000 0x000a2780 0x4082b710 0x40821abc 0x00000000 0x00000000
--- 0x4080cee8: vPortClearInterruptMaskFromISR at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/riscv/port.c:528
 (inlined by) vPortExitCritical at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/freertos/FreeRTOS-Kernel/portable/riscv/port.c:629

4082b630: 0x00026227 0x00000000 0x00000000 0x08918fee 0x00007c10 0x00000000 0x00007c10 0x408020d0
--- 0x408020d0: esp_light_sleep_inner at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:1224

4082b650: 0x00007c10 0x00000000 0x4081dc30 0x40809c38 0x40828000 0x40828000 0x00000000 0x600b1c00
--- 0x40809c38: esp_light_sleep_start at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_hw_support/sleep_modes.c:1431

4082b670: 0x408226c4 0x00000103 0x40828000 0x40828000 0x4082194c 0x40828000 0x40828000 0x00000001
4082b690: 0x00000000 0x00000000 0x08918fd4 0x40803fde 0x408218fc 0x40828000 0x40828000 0x4080e98a
--- 0x40803fde: vApplicationSleep at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/esp_pm/pm_impl.c:821
0x4080e98a: prvIdleTask at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/freertos/FreeRTOS-Kernel/tasks.c:4380

4082b6b0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b6d0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b6f0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b710: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0x00000158 0x4082b610 0x00000000 0x40821954
4082b730: 0x40821954 0x4082b724 0x4082194c 0x00000019 0x00000000 0x00000000 0x4082b724 0x00000000
4082b750: 0x00000000 0x4082b120 0x454c4449 0x00000000 0x00000000 0x00000000 0x4082b710 0x00000003
4082b770: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x408287c0 0x40828828
4082b790: 0x40828890 0x00000000 0x00000000 0x00000001 0x00000000 0x00000000 0x00000000 0x42006cb4
--- 0x42006cb4: esp_cleanup_r at /home/juantxorena/Projects/esp/v5.3.1/esp-idf/components/newlib/newlib_init.c:43

4082b7b0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b7d0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b7f0: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b810: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b830: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b850: 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000 0x00000000
4082b870: 0x00000000 0x00000000 0x00000200 0x00000800 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b890: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b8b0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b8d0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b8f0: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b910: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b930: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5
4082b950: 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5 0xa5a5a5a5

More Information.

No response

esp-wzh commented 4 weeks ago

Hi, @juantxorena

According to my investigation, the direct cause of this problem is that the modbus component selects UART_SCLK_DEFAULT as the clock source when it initializes the UART peripheral, and on esp32c6 that is the PLL clock source. Right after the chip wakes up from sleep, the PLL clock source is not yet available, so any access to the UART module gets stuck.

Currently the modbus stack does not support being used with esp_pm's autolightsleep, so please set light_sleep_enable to false when you configure esp_pm.

juantxorena commented 4 weeks ago

Hi @esp-wzh, thanks for the update. I'm a bit confused by the modbus UART part: according to the docs, this clock source is only set in the legacy UART driver, whatever that means. Should I open a bug report in the esp-modbus project then about this? Why is it forcing this clock and not another?

I will also have the 32kHz external oscillator, and I will use it as the main root clock, but I was testing stuff on a breadboard and I didn't have it there. Would this help somehow?

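Other than that, I need the autolightsleep, so I would need a workaround. Actually I only use modbus for reading data every X minutes (like 5 minutes or so), so I have some ideas:

1. When it's time to read, disable autosleep, wait for some time so the PLL clock source can go up (like 500ms, I can handle that), do whatever, and then activate the autosleep again.
2. Actually bring down the modbus and uart interface after reading data, and creating them again before reading, while also disabling temporarily the autosleep like before.
3. Maybe I can do the UART part in the ULP? I don't know how or if this is possible at all, maybe modbus requires a non ulp uart?
4. Or maybe use the LP UART port, which according to the specs has different clock sources, XTAL_D2_CLK and LP_FAST_CLK? Would it be possible?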

Of course, the ideal thing would be to fix this problem. Would any of these solutions work?

esp-wzh commented 4 weeks ago

Should I open a bug report in the esp-modbus project then about this? Why is it forcing this clock and not another?

Actually this problem should be fixed in the sleep process, I'm working on it.

I will also have the 32kHz external oscillator, and I will use it as the main root clock, but I was testing stuff on a breadboard and I didn't have it there. Would this help somehow?

Setting the UART clock source to UART_SCLK_XTAL in 1 & 2 switches the UART to the XTAL (40M main XTAL) clock source, so the watchdog will not be triggered during the wake-up process. I'm not sure whether the modbus stack will work as expected, but you can give it a try.

When it's time to read, disable autosleep, wait for some time so the PLL clock source can go up (like 500ms, I can handle that), do whatever, and then activate the autosleep again.

When the UART clock source is PLL, the wake-up process gets stuck while resuming the UART peripheral. This is a bug in the sleep code and has nothing to do with whether the application layer reads data from the UART.

Actually bring down the modbus and uart interface after reading data, and creating them again before reading, while also disabling temporarily the autosleep like before.

I think this is feasible.
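A rough sketch of what that second idea could look like in application code: hold a power-management lock so the chip cannot enter light sleep, bring Modbus up, read, tear Modbus down again, and only then release the lock. The names `modbus_read_cycle`, `app_mb_fn`, `host_mb_init` and `host_mb_read` are hypothetical and not part of ESP-IDF or esp-modbus. On target, the lock would be created once beforehand with `esp_pm_lock_create(ESP_PM_NO_LIGHT_SLEEP, 0, "modbus", &lock)`. The sketch assumes that `mbc_master_destroy()` also releases the UART driver, which has not been verified in this thread. The `#else` branch only contains stand-ins so the call ordering can be compiled and checked off-target.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#ifdef ESP_PLATFORM
#include "esp_err.h"
#include "esp_pm.h"
#include "mbcontroller.h"               /* esp-modbus v1.x */
#else
/* Host stand-ins: they only record calls so the ordering can be checked
 * off-target. They are NOT the real ESP-IDF or esp-modbus implementations. */
typedef int esp_err_t;
#define ESP_OK   0
#define ESP_FAIL (-1)
struct host_pm_lock { int held; };
typedef struct host_pm_lock *esp_pm_lock_handle_t;
static bool host_mb_up;
static esp_err_t esp_pm_lock_acquire(esp_pm_lock_handle_t l) { l->held++; return ESP_OK; }
static esp_err_t esp_pm_lock_release(esp_pm_lock_handle_t l) { l->held--; return ESP_OK; }
static esp_err_t mbc_master_destroy(void) { host_mb_up = false; return ESP_OK; }
/* Hypothetical application hooks used only for the host check. */
static esp_err_t host_mb_init(void) { host_mb_up = true; return ESP_OK; }
static esp_err_t host_mb_read(void) { return host_mb_up ? ESP_OK : ESP_FAIL; }
#endif

/* Hypothetical application hook type: "bring Modbus up" or "read once". */
typedef esp_err_t (*app_mb_fn)(void);

/* One read cycle with light sleep blocked for its whole duration.
 * no_sleep is expected to be an ESP_PM_NO_LIGHT_SLEEP lock. */
esp_err_t modbus_read_cycle(esp_pm_lock_handle_t no_sleep,
                            app_mb_fn init, app_mb_fn read_once)
{
    assert(init != NULL && read_once != NULL);

    esp_err_t err = esp_pm_lock_acquire(no_sleep);   /* block light sleep */
    if (err != ESP_OK) {
        return err;
    }
    err = init();                       /* mbc_master_init/setup/start ... */
    if (err == ESP_OK) {
        err = read_once();              /* e.g. mbc_master_send_request()  */
        (void)mbc_master_destroy();     /* tear the stack down again       */
    }
    (void)esp_pm_lock_release(no_sleep);             /* allow sleep again  */
    return err;
}
```

The lock is released only after the stack has been destroyed, so the chip should never go to sleep while the Modbus UART is still configured with the PLL clock.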

Maybe I can do the UART part in the ULP? I don't know how or if this is possible at all, maybe modbus requires a non ulp uart?

The modbus stack is too large for the ULP; implementing it there would be too hard.

Or maybe use the LP UART port, which according to the specs has different clock sources, XTAL_D2_CLK and LP_FAST_CLK? Would it be possible?

As mentioned above, the HP UART supports the XTAL clock source, so just select it. There is no exported configuration option for it, though, so you need to modify the modbus code manually.

esp-wzh commented 4 weeks ago

/cc @alisitsyn

alisitsyn commented 4 weeks ago

Hi @juantxorena,

The UART autolightsleep feature is supposed to wake the chip up when the UART receives a certain number of logic 1s. Refer to uart-wakeup-light-sleep-only.

The assumptions that can be checked for this feature to work:

UART peripheral contains a feature which allows waking up the chip from 
light sleep when a certain number of positive edges on RX pin are seen. 
This number of positive edges can be set using [uart_set_wakeup_threshold()](https://docs.espressif.com/projects/esp-idf/en/v4.3.3/esp32/api-reference/peripherals/uart.html#_CPPv425uart_set_wakeup_threshold11uart_port_ti)
function. ** Note that the character which triggers wakeup (and any 
characters before it) will not be received by the UART after wakeup. 
This means that the external device typically needs to send an extra 
character to the ESP32 to trigger wakeup, before sending the data. **

Depending on the baud rate, a few characters after that will also not be received.

"... will be changing. To ensure that UART has correct Baud rate all the
time, it is necessary to select a source clock which has a fixed
frequency and remains active during sleep. For the supported clock
sources of the chips, please refer to uart_sclk_t or soc_periph_uart_clk_src_legacy_t"

"... GPIO9 should be configured as function_5 to wake up UART1 ..."
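To get a feel for the remark that "depending on the baud rate, a few characters after that will also not be received", the number of characters that pass by during wake-up can be estimated from the character time. This is only a back-of-the-envelope sketch: `uart_chars_missed` is a made-up helper, not an ESP-IDF function, and the wake-up latency is an assumed input rather than a datasheet figure.

```c
#include <assert.h>
#include <stdint.h>

/* Estimate how many characters (rounded up) arrive on RX while the chip is
 * still waking up. bits_per_char counts start, data, parity and stop bits,
 * so 8N1 is 10 bits. wake_latency_us is an assumption supplied by the caller. */
uint32_t uart_chars_missed(uint32_t baud, uint32_t bits_per_char,
                           uint32_t wake_latency_us)
{
    assert(baud > 0 && bits_per_char > 0);

    /* Bits elapsed during wake-up, scaled by 1e6 to stay in integer math. */
    const uint64_t bits_x1e6 = (uint64_t)wake_latency_us * baud;
    const uint64_t char_x1e6 = (uint64_t)bits_per_char * 1000000u;

    return (uint32_t)((bits_x1e6 + char_x1e6 - 1) / char_x1e6); /* ceiling */
}
```

With an assumed latency of 2 ms and 8N1 framing, this gives 2 characters at 9600 baud and 24 characters at 115200 baud, which illustrates why the loss grows with the baud rate.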

Should I open a bug report in the esp-modbus project then about this? Why is it forcing this clock and not another?

You may open a separate issue against esp-modbus and link it to this one. This is actually not a bug but an unsupported feature of esp-modbus. esp-modbus is a component that works with many supported targets, and it is hard to address all possible issues on all targets with all features and their options because of the legacy code interface. In some specific cases these issues are expected to be resolved on the user side.

Other than that, I need the autolightsleep, so I would need a workaround. Actually I only use modbus for reading data every X minutes (like 5 minutes or so), so I have some ideas:

Or maybe use the LP UART port, which according to the specs has different clock sources, XTAL_D2_CLK and LP_FAST_CLK? Would it be possible?

Modbus can use LP_UART as well (checked). However, the UART configuration needs to be changed. I would propose a workaround that lets the Modbus stack work correctly with the autolightsleep feature enabled. Once the Modbus stack is initialized, the UART configuration is set and the UART clock is set to UART_SCLK_DEFAULT. In spite of this you can override the configuration of UART (call uart_param_config(overridden_config)) in your application and select a source clock which has a fixed frequency for autosleep feature to work. Please also note that the Modbus protocol does not send the extra bytes (preamble) to wake up the target and the UART may skip the first few bytes after wake up. I think this should work in your application. Can you try this and report the result?
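To illustrate the point about the missing preamble: a Modbus RTU frame starts directly with the slave address, followed by the function code, the payload and a CRC-16. If the first bytes are dropped after wake-up, the address and function code are gone and the CRC check on the rest fails. Below is a minimal, host-runnable sketch of a "read holding registers" (0x03) request like the one in the report. The helper names `mb_crc16` and `mb_build_read_request` are illustrative only and are not the esp-modbus API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC-16/MODBUS: initial value 0xFFFF, reflected polynomial 0xA001. */
uint16_t mb_crc16(const uint8_t *buf, size_t len)
{
    uint16_t crc = 0xFFFF;
    for (size_t i = 0; i < len; i++) {
        crc ^= buf[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1u) ? (uint16_t)((crc >> 1) ^ 0xA001u)
                             : (uint16_t)(crc >> 1);
        }
    }
    return crc;
}

/* Build an 8-byte RTU request for function 0x03 (read holding registers).
 * Note there is no preamble: byte 0 is already the slave address. */
void mb_build_read_request(uint8_t slave, uint16_t reg_start,
                           uint16_t reg_count, uint8_t out[8])
{
    out[0] = slave;
    out[1] = 0x03;
    out[2] = (uint8_t)(reg_start >> 8);     /* register address, big-endian */
    out[3] = (uint8_t)(reg_start & 0xFF);
    out[4] = (uint8_t)(reg_count >> 8);     /* register count, big-endian   */
    out[5] = (uint8_t)(reg_count & 0xFF);

    const uint16_t crc = mb_crc16(out, 6);
    out[6] = (uint8_t)(crc & 0xFF);         /* CRC goes out low byte first  */
    out[7] = (uint8_t)(crc >> 8);
}
```

For the request in this issue the call would be `mb_build_read_request(addr, 0x000A, 17, frame)`, where `addr` stands for whatever MB_DEVICE_ADDR is set to. The matching response is 39 bytes long (address, function code, byte count, 34 data bytes, 2 CRC bytes), so losing even the first byte after wake-up invalidates the whole reply.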

We can talk about other solutions after this.

Thanks.

juantxorena commented 4 weeks ago

In spite of this you can override the configuration of UART (call uart_param_config(overridden_config)) in your application and select a source clock which has a fixed frequency for autosleep feature to work. Please also note that the Modbus protocol does not send the extra bytes (preamble) to wake up the target and the UART may skip the first few bytes after wake up.

@alisitsyn I've tried this and apparently it works, so thanks for the help. It's hard to tell because as I said before, this problem happens only sometimes, but I've left the device reading from modbus every 2 minutes for 40 minutes and it didn't happen a single time. For reference, I used the following config:

uart_config_t xUartConfig = {
    .baud_rate = UART_BAUD_RATE,
    .data_bits = UART_DATA_8_BITS,
    .parity = UART_PARITY_DISABLE,
    .stop_bits = UART_STOP_BITS_1,
    .flow_ctrl = UART_HW_FLOWCTRL_DISABLE,
    .rx_flow_ctrl_thresh = 2,
    .source_clk = UART_SCLK_RTC
};

These are the values from portserial_m.c combined with my own, with RC_FAST as the clock, which according to the docs is never powered down.

Please also note that the Modbus protocol does not send the extra bytes (preamble) to wake up the target and the UART may skip the first few bytes after wake up.

I've seen in the docs that they recommend to send some data via UART to wake it up, but I don't know how to do this with modbus. For now I haven't got any problem, though, maybe because my code is the one always starting the communication with the modbus device.

The next step for me will be to try to use it with the LP UART. How would it work? Simply using the same uart_param_config trick as before? I'm not sure since the port will be now LP_UART_NUM_0, not UART_NUM_X, I don't know if modbus needs to check stuff there when configuring.

Thanks for the help anyway.

juantxorena commented 4 weeks ago

Update about the LP UART: apparently it works. I had to configure a normal UART with the same pins for modbus, and then I used the following:

    uart_config_t xUartConfig = {
        .baud_rate = UART_BAUD_RATE,
        .data_bits = UART_DATA_8_BITS,
        .parity = UART_PARITY_DISABLE,
        .stop_bits = UART_STOP_BITS_1,
        .flow_ctrl = UART_HW_FLOWCTRL_DISABLE,
        .rx_flow_ctrl_thresh = 2,
        .source_clk = LP_UART_SCLK_DEFAULT
    };
    uart_param_config(LP_UART_NUM_0, &xUartConfig);

and it seems to be working. I had to disable the UART output and use USB/JTAG instead, and now I'm relying on the modbus data published over MQTT to check whether it's working (since auto sleep is disabled while the JTAG is connected). When I tried to configure modbus with the LP_UART port it failed and went into a reboot loop, so I had to do it this not-so-clean way. It would be great if modbus could be configured directly with the LP_UART, but of course specifying then that it won't work with RS485, only with RS232.

alisitsyn commented 4 weeks ago

@juantxorena,

The next step for me will be to try to use it with the LP UART.

You can use the LP_UART_NUM_0 in the Modbus configuration and LP_UART_SCLK_LP_FAST in uart clock config.

Simply using the same uart_param_config trick as before? I'm not sure since the port will be now LP_UART_NUM_0, not UART_NUM_X, I don't know if modbus needs to check stuff there when configuring.

Unfortunately, it is impossible to override the clocks at the application level, because with LP_UART_NUM_0 the default clocks will be incorrect. So just change it in portserial_m.c. Also note that LP_UART_NUM_0 can only use the LP_GPIO numbers for pin configuration:

#define LP_U0RXD_GPIO_NUM 4
#define LP_U0TXD_GPIO_NUM 5
#define LP_U0RTS_GPIO_NUM 2
#define LP_U0CTS_GPIO_NUM 3

For now I haven't got any problem, though, maybe because my code is the one always starting the communication with the Modbus device.

Yes, this may be a reason.

I've seen in the docs that they recommend to send some data via UART to wake it up, but I don't know how to do this

It is possible to do this for Modbus on your custom devices by adding a preamble, but this will not conform to the standard.

It would be great if modbus could be configured directly with the LP_UART, but of course specifying then that it won't work with RS485, only with RS232.

I was thinking about it already, but unfortunately I had a hard time working around all the legacy issues to allow the Modbus stack to work on all supported esp-idf versions with all possible UART configuration options. And yes, RS485 is not supported in LP_UART, but it can be used with the auto-switching RS485 variant.

alisitsyn commented 2 weeks ago

@juantxorena,

Do you have any updates on this issue? Have you been able to solve the problem in your application? I think it is better to leave the code unchanged for now and address the PW in your application. An update to address the PW in esp-modbus has been added to the to-do list. Would you agree with this?

juantxorena commented 2 weeks ago

@alisitsyn sorry for my late answer. Yes, it is fixed, as you suggested, modifying the portserial_m.c file to make it use the LP_UART and its default clock was enough. I'll wait for the modbus component to be upgraded, but for now it's OK.

alisitsyn commented 1 week ago

@juantxorena,

Thank you for the feedback. This will be added to my task list for esp-modbus v2 with low priority. Unfortunately, it will take some time and will not be implemented in esp-modbus v1.0.x. Please consider using the v2 beta.