espressif / esp-bsp

Board support components for Espressif development boards

LVGL(managed components) causing a crash after a random time (IDFGH-14133) (BSP-587) #443

Open akashpraan opened 1 day ago

akashpraan commented 1 day ago

Answers checklist.

IDF version.

5.1.4

Espressif SoC revision.

ESP32 S3 Wroom 1U (16MB Flash)

Operating System used.

Windows

How did you build your project?

VS Code IDE

If you are using Windows, please specify command line type.

None

Development Kit.

Custom Board

Power Supply used.

USB

What is the expected behavior?

I expect the display to update properly every time and not crash after a random number of hours.

What is the actual behavior?

The ESP resets after a random number of hours

I want to figure out why this failure occurs and why it makes the ESP restart after a random number of hours. The last time I observed it, the system had been left on for 12 hours and the crash happened on the 12th or 15th button press.

lvgl/lvgl: "^8.3.0" and esp_lvgl_port: "^1" have been added to the dependencies in idf_component.yml.

Steps to reproduce.

I am using a black-and-white 0.96-inch OLED display to show the Fan Mode status and the sensor values.

  1. With a button (or an MQTT command), the user can change the Fan Mode. The change is shown on the OLED, and after 5 seconds (using a timer) the sensor data is displayed (the sensor data is refreshed every 2 minutes).
  2. On some custom PCBs (4 of 40 so far) undergoing extended testing, we observe that after a while the ESP restarts, and the panic output mentions “memcpy in ROM”.
  3. I tried to reproduce the issue by pressing the button at short intervals (about 30 times a minute). I did not observe a crash for 3-4 minutes straight, so I stopped. My system has now been running normally for 3.5 hours, with me repeating the 30-presses-a-minute test for 2 minutes every hour.

In one instance I captured the following panic output and backtrace from the ESP:

Guru Meditation Error: Core  0 panic'ed (LoadProhibited). Exception was unhandled.
Core  0 register dump:
PC      : 0x420168ea  PS      : 0x00060b30  A0      : 0x820169db  A1      : 0x3fcd57e0
--- 0x420168ea: get_prop_core at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_style.c:613
A2      : 0x3fcaf568  A3      : 0x00000000  A4      : 0x00000000  A5      : 0x00000000
A6      : 0x00000002  A7      : 0xb33fffff  A8      : 0x820168bd  A9      : 0x3fcd57c0
A10     : 0x00000001  A11     : 0x00588f0e  A12     : 0x00000000  A13     : 0x00000000
A14     : 0x00060023  A15     : 0x00000003  SAR     : 0x0000001f  EXCCAUSE: 0x0000001c
EXCVADDR: 0x00000004  LBEG    : 0x40056f5c  LEND    : 0x40056f72  LCOUNT  : 0xffffffff
--- 0x40056f5c: memcpy in ROM
0x40056f72: memcpy in ROM

Backtrace: 0x420168e7:0x3fcd57e0 0x420169d8:0x3fcd5830 0x420150a5:0x3fcd5860 0x42014567:0x3fcd5890 0x42014900:0x3fcd58d0 0x420148e5:0x3fcd58f0 0x42014984:0x3fcd5910 0x42018995:0x3fcd5930 0x4201d695:0x3fcd5960 0x4201d746:0x3fcd5980 0x42010aac:0x3fcd59a0 0x40380006:0x3fcd59c0
--- 0x420168e7: get_prop_core at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_style.c:612
0x420169d8: lv_obj_get_style_prop at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_style.c:229        
0x420150a5: lv_obj_get_style_base_dir at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_style_gen.h:567
 (inlined by) lv_obj_get_scroll_left at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_scroll.c:167    
0x42014567: lv_obj_refr_size at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_pos.c:90
0x42014900: layout_update_core at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_pos.c:1135
0x420148e5: layout_update_core at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_pos.c:1130 (discriminator 3)
0x42014984: lv_obj_update_layout at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_obj_pos.c:316
0x42018995: _lv_disp_refr_timer at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/core/lv_refr.c:308
0x4201d695: lv_timer_exec at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/misc/lv_timer.c:313 (discriminator 2)
0x4201d746: lv_timer_handler at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/lvgl__lvgl/src/misc/lv_timer.c:109
0x42010aac: lvgl_port_task at C:/Akash/Firmwares/Hive/Hive_E_ESPIDF_514/Hive_E/managed_components/espressif__esp_lvgl_port/esp_lvgl_port.c:691
0x40380006: vPortTaskWrapper at C:/ESP/esp-idf/v5.1.4/esp-idf/components/freertos/FreeRTOS-Kernel/portable/xtensa/port.c:162

Debug Logs.

No response

More Information.

I added several checks, such as not re-creating a label that already exists and checking that there is an active screen.

The function that updates the screen, display_data(), is called from a button interrupt, from the MQTT task when a command arrives, or from the Admin Task to update the status every 2 minutes.

Is there some state I need to check, such as a mutex, before switching between display_data() calls? I call the function like this:

display_data(disp_handle, "CONNECTING", 0, NULL, LV_ALIGN_CENTER);
display_data(disp_handle, "Deep Clean", 0, NULL, LV_ALIGN_CENTER);
display_data(disp_handle, "PM2.5", pm2_5, NULL, LV_ALIGN_CENTER);

void init_lcd()
{
    #if LCD_DISPLAY_DEBUG == ON
    ESP_LOGI("Display", "Install panel IO");
    #endif

    esp_lcd_panel_io_i2c_config_t io_config = 
    {
        .dev_addr = EXAMPLE_I2C_HW_ADDR,        // I2C address of the SSD1306 display.
        .control_phase_bytes = 1,               // Control phase bytes (as per SSD1306 datasheet).
        .lcd_cmd_bits = EXAMPLE_LCD_CMD_BITS,   // Command bits for the LCD (as per SSD1306 datasheet).
        .lcd_param_bits = EXAMPLE_LCD_CMD_BITS, // Parameter bits for the LCD (as per SSD1306 datasheet).
        .dc_bit_offset = 6,                     // Data/Command bit offset (as per SSD1306 datasheet).
    };

    ESP_ERROR_CHECK(esp_lcd_new_panel_io_i2c((esp_lcd_i2c_bus_handle_t)I2C_HOST, &io_config, &io_handle));

    panel_handle = NULL;

    esp_lcd_panel_dev_config_t panel_config = 
    {
        .bits_per_pixel = 1,                   // Number of bits per pixel (monochrome display).
        .reset_gpio_num = EXAMPLE_PIN_NUM_RST, // GPIO number for the reset pin.
    };

    ESP_ERROR_CHECK(esp_lcd_new_panel_ssd1306(io_handle, &panel_config, &panel_handle));
    ESP_ERROR_CHECK(esp_lcd_panel_reset(panel_handle));
    ESP_ERROR_CHECK(esp_lcd_panel_init(panel_handle));
    ESP_ERROR_CHECK(esp_lcd_panel_disp_on_off(panel_handle, true));

    const lvgl_port_cfg_t lvgl_cfg = ESP_LVGL_PORT_INIT_CONFIG();
    lvgl_port_init(&lvgl_cfg);

    const lvgl_port_display_cfg_t disp_cfg =
    {
        .io_handle = io_handle,                                // Handle for the panel IO interface.
        .panel_handle = panel_handle,                          // Handle for the panel.
        .buffer_size = EXAMPLE_LCD_H_RES * EXAMPLE_LCD_V_RES,  // Buffer size based on resolution.
        .double_buffer = true,                                 // Enable double buffering for smoother rendering.
        .hres = EXAMPLE_LCD_H_RES,                             // Horizontal resolution of the display.
        .vres = EXAMPLE_LCD_V_RES,                             // Vertical resolution of the display.
        .monochrome = true,                                    // Set display to monochrome mode.
        .rotation = 
        {
            .swap_xy = false,                                  // No swapping of X and Y coordinates.
            .mirror_x = false,                                 // No mirroring along the X-axis.
            .mirror_y = false,                                 // No mirroring along the Y-axis.
        }
    };

    disp_handle = lvgl_port_add_disp(&disp_cfg);

    const esp_lcd_panel_io_callbacks_t cbs = 
    {
        .on_color_trans_done = notify_lvgl_flush_ready, // Callback to notify when LVGL flush is ready.
    };
    esp_lcd_panel_io_register_event_callbacks(io_handle, &cbs, disp_handle);

    lv_disp_set_rotation(disp_handle, LV_DISP_ROT_180);         // Set the rotation of the display to 180 degrees.
}

void display_data(lv_disp_t *disp, const char* label_text, uint16_t value, const char* unit_symbol, lv_align_t alignment)
{
    #if LCD_DISPLAY_DEBUG == ON
    ESP_LOGI("DisplayData", "Entering display_data function");
    ESP_LOGI("DisplayData", "Parameters - label_text: %s, value: %d, unit_symbol: %s, alignment: %d", label_text, value, unit_symbol ? unit_symbol : "NULL", alignment);
    #endif

    // Get the active screen for the given display.
    lv_obj_t *scr = lv_disp_get_scr_act(disp);
    if (scr == NULL) {
        #if LCD_DISPLAY_DEBUG == ON
        ESP_LOGE("DisplayData", "Active screen is NULL");
        #endif
        return;
    }

    // If label_main does not exist, create it.
    if (label_main == NULL) {
        #if LCD_DISPLAY_DEBUG == ON
        ESP_LOGI("DisplayData", "Creating new main label");
        #endif
        label_main = lv_label_create(scr);
        lv_label_set_long_mode(label_main, LV_LABEL_LONG_WRAP);
        lv_obj_set_style_text_font(label_main, &font_articulate_regular_18_8bpp, LV_PART_MAIN | LV_STATE_DEFAULT);  // Use custom font
    }

    // Update the text of the main label.
    lv_label_set_text(label_main, label_text);

    // If label_value does not exist, create it.
    if (label_value == NULL) {
        #if LCD_DISPLAY_DEBUG == ON
        ESP_LOGI("DisplayData", "Creating new value label");
        #endif
        label_value = lv_label_create(scr);
        lv_label_set_long_mode(label_value, LV_LABEL_LONG_WRAP);
        lv_obj_set_style_text_font(label_value, &font_articulate_regular_26_8bpp, LV_PART_MAIN | LV_STATE_DEFAULT);  // Use custom font
    }

    char buf[20];   // Format the value and the unit symbol into the buffer.

    // Treat a value of 0 as "no value" (a uint16_t cannot be NULL), so nothing numeric is displayed for it
    if (value == 0) 
    {
        // If value is 0, check if it's a text (unit_symbol) to display, else display nothing
        if (unit_symbol != NULL) {
            snprintf(buf, sizeof(buf), "%s", unit_symbol);  // Show the text if unit_symbol is not NULL
        } 
        else {
            snprintf(buf, sizeof(buf), " ");  // Display nothing if no unit_symbol
        }
    } 
    else {
        // For non-zero values, display normally
        if (unit_symbol != NULL) {
            snprintf(buf, sizeof(buf), "%d%s", value, unit_symbol);  // Show value with unit symbol
        } 
        else {
            snprintf(buf, sizeof(buf), "%d", value);  // Just display the numeric value
        }
    }

    lv_label_set_text(label_value, buf);        // Update the text of the value label.

    // Get screen dimensions
    lv_coord_t screen_width = lv_obj_get_width(scr);
    lv_coord_t screen_height = lv_obj_get_height(scr);

    // Set a fixed y-coordinate for vertical alignment (center of the screen)
    lv_coord_t label_y = screen_height / 2 - 9;  // Static vertical alignment 
    lv_coord_t value_y = screen_height / 2 - 15;  // Static vertical alignment 
    lv_coord_t label_x = 12;  // Set a fixed x-coordinate for the label (left-aligned), Fixed padding from the left

    lv_coord_t value_x = screen_width - 58; // Set a fixed x-coordinate for the value (right-aligned), Fixed padding from the right 
    lv_obj_align(label_main, LV_ALIGN_TOP_LEFT, label_x, label_y);      // Align the main label to the left, with fixed y-coordinate
    lv_obj_align(label_value, LV_ALIGN_TOP_LEFT, value_x, value_y);     // Align the value label to the right, with fixed y-coordinate

    #if LCD_DISPLAY_DEBUG == ON
    ESP_LOGI("DisplayData", "Exiting display_data function");
    ESP_LOGI("DisplayData", " ");
    #endif
}
igrr commented 12 hours ago

@akashpraan Your program is crashing because it tries to dereference a NULL pointer. Please check how to interpret panic output in this section of the docs: https://docs.espressif.com/projects/esp-idf/en/latest/esp32s3/api-guides/fatal-errors.html#loadprohibited-storeprohibited

For your case, I would recommend enabling GDB Stub and then waiting for the program to crash. Once the program has crashed it will enter the GDB session and you can inspect the variables to understand what happened.

At a brief look, it seems the issue happens because you are manipulating LVGL state in the display_data function from your task while LVGL is performing a refresh in its own task. Typically you need to hold the mutex while modifying UI state outside of the LVGL task.
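
As a rough illustration of how to read the register dump above: EXCCAUSE 0x1c is 28, the LoadProhibited cause, and EXCVADDR 0x00000004 is the address the failing load touched. An address this close to zero usually means a field at a small offset was read through a NULL struct pointer. The sketch below is plain host-side C, not ESP-IDF or LVGL code. The struct node_t, the helpers member_address and looks_like_null_member_access, and the 4 KiB threshold are all hypothetical, chosen only to show the arithmetic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical struct: the second field sits at byte offset 4. */
typedef struct {
    uint32_t first;   /* offset 0 */
    uint32_t second;  /* offset 4 */
} node_t;

/* Address that a load of `second` would touch for a given struct pointer value. */
uintptr_t member_address(uintptr_t struct_ptr)
{
    return struct_ptr + offsetof(node_t, second);
}

/* Heuristic (assumption): treat a faulting address in the first 4 KiB as
 * "a member was accessed through a NULL pointer". */
bool looks_like_null_member_access(uint32_t excvaddr)
{
    return excvaddr < 0x1000u;
}
```

With a NULL struct pointer, member_address(0) gives 0x4, which matches the EXCVADDR in the dump. A valid RAM address such as the A2 value 0x3fcaf568 would not be flagged by the heuristic.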

akashpraan commented 4 hours ago

Hello @igrr I am using LVGL for the first time, what mutex are you talking about?

Sorry to bother you with these questions; I tried finding the answer but couldn't.

tore-espressif commented 4 hours ago

I am using LVGL for the first time, what mutex are you talking about?

LVGL is not thread-safe. You must call lvgl_port_lock():

https://github.com/espressif/esp-bsp/blob/48936a65c8bd5bad95de0582b5ad15e0f35558c3/components/esp_lvgl_port/include/esp_lvgl_port.h#L104

before you call any LVGL function (LVGL functions start with lv_). After you are done, call lvgl_port_unlock():

https://github.com/espressif/esp-bsp/blob/48936a65c8bd5bad95de0582b5ad15e0f35558c3/components/esp_lvgl_port/include/esp_lvgl_port.h#L110

These two functions take and release the mutex that protects LVGL's internal data while you are modifying the UI.

So it should look like this

...
if (lvgl_port_lock(1000)) {   // Take LVGL lock (implemented as mutex), waiting up to 1000 ms
    lv_*();                   // Do what you need with your UI
    lvgl_port_unlock();       // Release the lock
}
...
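
Applied to the display_data() function from the question, the same pattern could be wrapped in one helper so that every caller goes through the lock. This is a minimal sketch, not code from the component: run_with_lvgl_lock, ui_update_fn and example_update are hypothetical names, and the #else branch holds host-side stand-ins for lvgl_port_lock() and lvgl_port_unlock() so the snippet compiles outside ESP-IDF. On the target, only the ESP_PLATFORM branch applies.

```c
#include <stdbool.h>
#include <stdint.h>

#ifdef ESP_PLATFORM
#include "esp_lvgl_port.h"   /* real lvgl_port_lock() / lvgl_port_unlock() */
#else
/* Host-side stand-ins (assumption, illustration only): they only track
 * how many times the "lock" is currently held. */
static int fake_lock_depth = 0;
static bool lvgl_port_lock(uint32_t timeout_ms)
{
    (void)timeout_ms;
    fake_lock_depth++;
    return true;
}
static void lvgl_port_unlock(void)
{
    fake_lock_depth--;
}
#endif

/* Hypothetical callback type: the callback body makes the lv_* calls. */
typedef void (*ui_update_fn)(void *arg);

/* Runs `fn` with the LVGL lock held. Returns false, without calling `fn`,
 * if the lock could not be taken within timeout_ms. */
bool run_with_lvgl_lock(uint32_t timeout_ms, ui_update_fn fn, void *arg)
{
    if (!lvgl_port_lock(timeout_ms)) {
        return false;        /* never touch LVGL without the lock */
    }
    fn(arg);                 /* e.g. a function that calls display_data(...) */
    lvgl_port_unlock();
    return true;
}

/* Hypothetical example callback: stands in for a UI update and counts calls. */
static void example_update(void *arg)
{
    int *calls = (int *)arg;
    (*calls)++;
}
```

A task would then call something like run_with_lvgl_lock(1000, example_update, &calls) and skip the update if it returns false. Because the lock is a mutex, the helper should be called from task context; a button interrupt would need to hand the event to a task rather than call it directly.
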
igrr commented 4 hours ago

Here is the related part of the docs, for reference: https://github.com/espressif/esp-bsp/blob/48936a65c8bd5bad95de0582b5ad15e0f35558c3/components/esp_lvgl_port/README.md#lvgl-api-usage