espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0

ESP32 resets unexpectedly after two days of working perfectly without resetting (IDFGH-10012) #11290

Open Kunaalkk1 opened 1 year ago

Kunaalkk1 commented 1 year ago

Answers checklist.

IDF version.

v4.4

Operating System used.

Windows

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

CMD

Development Kit.

ESP32 Dev Module (via USB Bridge)

Power Supply used.

USB

What is the expected behavior?

My code worked as expected for two whole days. On the third day of the test, the microcontroller started resetting unexpectedly.

It doesn't show any backtrace, or any verbose error via ESP_LOG. It just resets every few seconds.

What is the actual behavior?

For the first two days it worked perfectly, behaving exactly as intended; the resets began on the third day.

The drivers I used are:

#include <stdio.h>
#include <string.h>
#include <driver/twai.h>
#include <driver/gpio.h>
#include <driver/uart.h>
#include <esp_log.h>
#include <freertos/FreeRTOS.h>
#include <freertos/task.h>

Following is my board configuration:

// UART CONFIGURATION
#define UART_PORT_NUM UART_NUM_0
#define UART_TXD GPIO_NUM_1
#define UART_RXD GPIO_NUM_3
#define UART_RTS UART_PIN_NO_CHANGE
#define UART_CTS UART_PIN_NO_CHANGE
#define UART_BUF_SIZE 2048

#ifdef CONFIG_UART_ISR_IN_IRAM
#define INTR_ALLOC_FLAGS ESP_INTR_FLAG_IRAM
#else
#define INTR_ALLOC_FLAGS 0
#endif

// CAN CONFIGURATION
#define TWAI_TXD 5
#define TWAI_RXD 4
#define TWAI_QUEUE_LEN 8
#define CLK_OUT_IO TWAI_IO_UNUSED
#define BUS_OFF_IO TWAI_IO_UNUSED

// LED CONFIGURATION
#define LED_BUILTIN 2
#define GPIO_LEVEL_HIGH 1
#define GPIO_LEVEL_LOW 0

Following are my driver configurations:

// Configuration of UART
uart_config_t UART_CONFIG = {
    .baud_rate = 115200,
    .data_bits = UART_DATA_8_BITS,
    .parity = UART_PARITY_DISABLE,
    .stop_bits = UART_STOP_BITS_1,
    .flow_ctrl = UART_HW_FLOWCTRL_DISABLE,
    .rx_flow_ctrl_thresh = 122,
    .source_clk = (uart_sclk_t)80000000};

// Configuration of CAN
twai_general_config_t TWAI_GENERAL_CONFIG = {
    .mode = TWAI_MODE_NORMAL,
    .tx_io = TWAI_TXD,
    .rx_io = TWAI_RXD,
    .clkout_io = CLK_OUT_IO,
    .bus_off_io = BUS_OFF_IO,
    .tx_queue_len = TWAI_QUEUE_LEN,
    .rx_queue_len = TWAI_QUEUE_LEN,
    .alerts_enabled = TWAI_ALERT_ALL,
    .clkout_divider = 0};

twai_timing_config_t TWAI_TIMING_CONFIG = TWAI_TIMING_CONFIG_500KBITS();
twai_filter_config_t TWAI_FILTER_CONFIG = TWAI_FILTER_CONFIG_ACCEPT_ALL();

// Config LED GPIO
gpio_config_t led_conf = {
    .intr_type = GPIO_INTR_DISABLE,
    .mode = GPIO_MODE_OUTPUT,
    .pin_bit_mask = 1ULL << LED_BUILTIN,
    .pull_down_en = GPIO_PULLDOWN_DISABLE,
    .pull_up_en = GPIO_PULLUP_DISABLE};

Steps to reproduce.

I am writing firmware to convert UART to CAN data and vice versa.

Debug Logs.

NOTE: You'll see some bizarre characters such as "–@DD“@”@•@ ò ö ô", which are just CAN data converted to UART. There is no backtrace or verbose output that shows why or when the error occurred. Also note that it only resets when UART communication is active.

ets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (112) cpu_start: Pro cpu up.
I (113) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (126) cpu_start: Pro cpu start user code
I (127) cpu_start: cpu freq: 160000000
I (127) cpu_start: Application information:
I (131) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (137) cpu_start: App version:      1
I (141) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (147) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (153) cpu_start: ESP-IDF:          v4.4.4-dirty
I (159) heap_init: Initializing. RAM available for dynamic allocation:
I (166) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (172) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (178) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (185) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (191) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (199) spi_flash: detected chip: generic
I (202) spi_flash: flash io: dio
I (207) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CP¥.‘@þîets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (112) cpu_start: Pro cpu up.
I (113) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (126) cpu_start: Pro cpu start user code
I (127) cpu_start: cpu freq: 160000000
I (127) cpu_start: Application information:
I (131) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (137) cpu_start: App version:      1
I (141) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (147) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (153) cpu_start: ESP-IDF:          v4.4.4-dirty
I (159) heap_init: Initializing. RAM available for dynamic allocation:
I (166) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (172) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (178) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (185) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (191) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (199) spi_flash: detected chip: generic
I (202) spi_flash: flash io: dio
I (207) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.7[0•@òöô’@DDî‘@þî•@òöô”@—@‘@þî“@–@DD@А@Е@òöô‘@þî–@DD’@DDî@Е@ôðî,
–@DD“@”@•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
—@–@DD‘@þî‘@þîets Jul 29 2019 12:21:46

rst:0x1 (POWERON_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (112) cpu_start: Pro cpu up.
I (113) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (126) cpu_start: Pro cpu start user code
I (127) cpu_start: cpu freq: 160000000
I (127) cpu_start: Application information:
I (131) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (137) cpu_start: App version:      1
I (141) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (147) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (153) cpu_start: ESP-IDF:          v4.4.4-dirty
I (159) heap_init: Initializing. RAM available for dynamic allocation:
I (166) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (172) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (178) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (185) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (191) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (199) spi_flash: detected chip: generic
I (202) spi_flash: flash io: dio
I (207) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.7[0’@DDî•@òöô–@DD“@–@DD”@”@‘@þî‘@þî”@ets Jul 29 2019 12:21:46

rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (139) cpu_start: Pro cpu up.
I (139) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (153) cpu_start: Pro cpu start user code
I (153) cpu_start: cpu freq: 160000000
I (153) cpu_start: Application information:
I (158) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (164) cpu_start: App version:      1
I (168) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (174) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (180) cpu_start: ESP-IDF:          v4.4.4-dirty
I (186) heap_init: Initializing. RAM available for dynamic allocation:
I (193) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (199) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (205) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (211) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (218) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (225) spi_flash: detected chip: generic
I (229) spi_flash: flash io: dio
I (234) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.v[0–@DD•@úòô,
•@ðþö,
•@öøö,
–@DD•@òöô,
•@ôðî,
•@úòô,
ets Jul 29 2019 12:21:46

rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (139) cpu_start: Pro cpu up.
I (139) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (153) cpu_start: Pro cpu start user code
I (153) cpu_start: cpu freq: 160000000
I (153) cpu_start: Application information:
I (158) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (164) cpu_start: App version:      1
I (168) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (174) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (180) cpu_start: ESP-IDF:          v4.4.4-dirty
I (186) heap_init: Initializing. RAM available for dynamic allocation:
I (193) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (199) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (205) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (211) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (218) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (225) spi_flash: detected chip: generic
I (229) spi_flash: flash io: dio
I (234) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.6[0“@•@òöô•@úòô•@úòô,
–@DD”@’@DDî—@@А@Ð’@DDî•@ðþö,
•@öøö,
–@DD•@ðþö,
•@öøö,
–@DD”@—@•@ðþö,
•@öøö,
–@DD”@•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
–@DD@Е@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
–@DD“@‘@þî–@DD—@”@‘@þî“@•@òöô•@òöô@А@Е@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
—@•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
—@“@”@’@DDî”@•@ðþö,
•@öøö,
–@DD’@DDî“@•@úòô,
•@ðþö,
•@öøö,
–@DD•@úòô,
•@ðþö,
•@öøö,
–@DD“@@Ð’@•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
—@•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
•@òöô,
•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
—@@З@”@”@–@DD@Ð’@DDu0•@òöô—@‘@þî’@DDî•@òöô”@”@–@DD@Б@þî–@DD@Е@ðþö,
•@öøö,
‘@•@ôðî,
•@úòô,
•@ðþö,
•@öøö,
–@DDets Jul 29 2019 12:21:46

rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (139) cpu_start: Pro cpu up.
I (139) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (153) cpu_start: Pro cpu start user code
I (153) cpu_start: cpu freq: 160000000
I (153) cpu_start: Application information:
I (158) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (164) cpu_start: App version:      1
I (168) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (174) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (180) cpu_start: ESP-IDF:          v4.4.4-dirty
I (186) heap_init: Initializing. RAM available for dynamic allocation:
I (193) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (199) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (205) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (211) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (218) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (225) spi_flash: detected chip: generic
I (229) spi_flash: flash io: dio
I (234) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.6[0’@DD“@”@@З@’@DDî•@òöô“@‘@þî‘@þî’@DDî‘@þî–@DD—@”@‘@þîets Jul 29 2019 12:21:46

rst:0x8 (TG1WDT_SYS_RESET),boot:0x17 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:1184
load:0x40078000,len:12880
load:0x40080400,len:3036
entry 0x400805e4
I (139) cpu_start: Pro cpu up.
I (139) cpu_start: Starting app cpu, entry point is 0x40081134
I (0) cpu_start: App cpu up.
I (153) cpu_start: Pro cpu start user code
I (153) cpu_start: cpu freq: 160000000
I (153) cpu_start: Application information:
I (158) cpu_start: Project name:     pyCANv1.0_TWAI_CAN
I (164) cpu_start: App version:      1
I (168) cpu_start: Compile time:     Apr 28 2023 11:33:28
I (174) cpu_start: ELF file SHA256:  9f67f840ee215b2b...
I (180) cpu_start: ESP-IDF:          v4.4.4-dirty
I (186) heap_init: Initializing. RAM available for dynamic allocation:
I (193) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (199) heap_init: At 3FFB2DC8 len 0002D238 (180 KiB): DRAM
I (205) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (211) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (218) heap_init: At 4008C9FC len 00013604 (77 KiB): IRAM
I (225) spi_flash: detected chip: generic
I (229) spi_flash: flash io: dio
I (234) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.6[0–@DD—@–@DD

More Information.

I tried it on two devices. For the first two days, both devices worked perfectly; on the third day, both showed the same problem. @espressif-abhikroy @espressif-bot @Espressif-liuuuu @espressif-zhanghu @esp-cjh @Esp-Doc @ESP-iPENCIL @esp-jiangguangming @esp-lis Could anyone please help?

Thanks in advance.

Regards, Kunaal

ginkgm commented 1 year ago

Hi @Kunaalkk1 ,

The issue you reported is basically an interrupt watchdog timeout. Judging from your log, it is fairly critical: the ISR that would normally catch the timeout can't even be executed, so the hardware resets the system directly.

See: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/system/wdts.html?highlight=watchdog#interrupt-watchdog-timer-iwdt

Before we can effectively handle the issue, we need to find out which module causes the problem. There is a feature to show the PC of each core when the system resets: https://github.com/espressif/esp-idf/blob/master/components/bootloader_support/src/esp32/bootloader_esp32.c#L122.

Unfortunately, it isn't displayed in your log. I guess that's because BOOTLOADER_LOG_LEVEL is set to a very low level. Could you adjust that and provide more logs?

We need:

  1. The exact commit SHA of your IDF
  2. Your sdkconfig file
  3. Information about the address where it crashes. You may provide the ELF file of your project (note that it may include secrets of your project), or use addr2line with the address and the ELF file to get the information.
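
For reading the `rst:` lines in the log above, here is a minimal sketch of a lookup from the code printed by the boot ROM to its symbolic name. The log contains two distinct codes, 0x1 and 0x8; the latter is the one this comment attributes to the interrupt watchdog. The helper name `reset_reason_name` and the table are hypothetical (they are not part of ESP-IDF), and the table deliberately lists only a subset of codes.

```c
#include <stddef.h>

/* Reset codes as printed after "rst:" by the ESP32 boot ROM.
 * Assumption: only a subset is listed; any other code maps to "UNKNOWN". */
typedef struct {
    unsigned code;
    const char *name;
} reset_entry_t;

static const reset_entry_t RESET_TABLE[] = {
    {0x01, "POWERON_RESET"},     /* first three boots in the log above */
    {0x03, "SW_RESET"},
    {0x05, "DEEPSLEEP_RESET"},
    {0x07, "TG0WDT_SYS_RESET"},
    {0x08, "TG1WDT_SYS_RESET"},  /* later boots in the log above */
    {0x0C, "SW_CPU_RESET"},
    {0x10, "RTCWDT_RTC_RESET"},
};

/* Hypothetical helper: returns the symbolic name for a reset code. */
const char *reset_reason_name(unsigned code)
{
    for (size_t i = 0; i < sizeof(RESET_TABLE) / sizeof(RESET_TABLE[0]); i++) {
        if (RESET_TABLE[i].code == code) {
            return RESET_TABLE[i].name;
        }
    }
    return "UNKNOWN";
}
```

Calling `reset_reason_name(0x8)` yields the same string the ROM prints in the later boots of the log, which makes it easy to tally how many resets of each kind a long capture contains.
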

Kunaalkk1 commented 1 year ago

@ginkgm I have disabled the watchdog timer in make menuconfig. This problem is coming after that.

Also, the watchdog timer does not reset the ESP32; it only interrupts the processor.

As you can see in my Debug Logs section, there is no backtrace generated before the system crashes, so there is no point in sharing the elf.

Further, VS Code has set the BOOTLOADER_LOG_LEVEL as per its default. In fact, ALL parameters of SDK Config are default.

Anyway, I did some debugging. When I commented out the line uart_flush(UART_NUM_0) from my code, this issue was resolved. There must be some problem in this function; please check it.

BUT MY OVERALL PROBLEM IS NOT SOLVED.

I have a CAN tool that sends data over CAN; the firmware converts it to UART and writes my UART frames correctly.

However, when I send data from UART, I convert it to twai_message_t and transmit it to CAN.

esp_err_t ret = twai_transmit(&msg, 10 / portTICK_PERIOD_MS);
if(ret != ESP_OK)
{
    gpio_set_level(LED_BUILTIN, GPIO_LEVEL_HIGH);
}

The above code is supposed to turn on the LED when the message transmission fails, which is exactly what is happening.
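
One detail that may be worth checking is how long the timeout `10 / portTICK_PERIOD_MS` actually is. The sketch below reproduces the integer arithmetic in plain C so it can be run off-target. It assumes the default ESP-IDF tick rate of 100 Hz (the sdkconfig is said to be all defaults) and a tick rate of at most 1000 Hz; the function names are hypothetical stand-ins for the FreeRTOS macros.

```c
#include <stdint.h>

/* Same arithmetic as `ms / portTICK_PERIOD_MS`, where portTICK_PERIOD_MS
 * is defined as 1000 / configTICK_RATE_HZ (integer division).
 * Assumption: tick_rate_hz is between 1 and 1000. */
uint32_t ticks_by_division(uint32_t ms, uint32_t tick_rate_hz)
{
    uint32_t tick_period_ms = 1000u / tick_rate_hz;
    return ms / tick_period_ms;
}

/* Same arithmetic as the FreeRTOS macro pdMS_TO_TICKS(ms). */
uint32_t ticks_by_macro(uint32_t ms, uint32_t tick_rate_hz)
{
    return (ms * tick_rate_hz) / 1000u;
}
```

At 100 Hz both forms turn 10 ms into a single tick. A one-tick timeout expires at the next tick interrupt, so the real wait can be anywhere from almost zero up to 10 ms, and any value below 10 ms rounds down to zero ticks (no waiting at all). If the transmit queue is often full, a longer timeout might be worth trying.
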

Since the exact same code I am using has been tested and verified as OK, I do not believe that it is a problem with the firmware. However, I will try to print the id, DLC, and data to see if there is some mistake in that.
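
A minimal sketch of what printing the id, DLC, and data could look like. To keep it runnable off-target, `frame_t` is a hypothetical stand-in that mirrors only the `identifier`, `data_length_code`, and `data` fields of `twai_message_t`, and `format_frame` is an illustrative helper, not an ESP-IDF function.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical stand-in for the twai_message_t fields used here. */
typedef struct {
    uint32_t identifier;
    uint8_t  data_length_code;
    uint8_t  data[8];
} frame_t;

/* Formats a frame as "id=0x123 dlc=2 data=DE AD".
 * Returns the number of characters written, or -1 if buf is too small. */
int format_frame(const frame_t *f, char *buf, size_t len)
{
    int n = snprintf(buf, len, "id=0x%03X dlc=%u data=",
                     (unsigned)f->identifier, (unsigned)f->data_length_code);
    if (n < 0 || (size_t)n >= len) {
        return -1;
    }
    /* Clamp so a corrupt DLC can never read past the 8-byte payload. */
    unsigned count = f->data_length_code > 8 ? 8u : f->data_length_code;
    for (unsigned i = 0; i < count; i++) {
        size_t room = len - (size_t)n;
        int m = snprintf(buf + n, room, "%s%02X",
                         i ? " " : "", (unsigned)f->data[i]);
        if (m < 0 || (size_t)m >= room) {
            return -1;
        }
        n += m;
    }
    return n;
}
```

On the target, the resulting string could be passed to ESP_LOGI just before the twai_transmit call, so that a failed transmission can be matched to the exact frame that was attempted. Note that in this firmware UART0 also carries the converted CAN data, so log output would be interleaved with it.
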

Please check my BOARD CONFIG and DRIVER CONFIG that I have enclosed in the first message above.

Regards, Kunaal