espressif / esp-idf

Espressif IoT Development Framework. Official development framework for Espressif SoCs.
Apache License 2.0
13.71k stars 7.3k forks source link

[NVS] updating a key adds it multiple times in memory instead of updating the previous one (is that normal ?!!) (IDFGH-5487) #7221

Closed petermagdy94 closed 3 years ago

petermagdy94 commented 3 years ago

Environment

Problem Description

I ran the example nvs_rw_value, and it's working well and values are read and write as expected. BUT the problem is how this value is saved in memory.

I noticed that when calling nvs_set_i32() or nvs_commit() -i'm not sure which one makes the problem-, the same key is stored in a new place in the memory keeping the previous one. And so after some time the nvs partition is filled with multiple values for this key and overflow the partition again.

Expected Behavior

one key with updated value in the NVS flash.

Actual Behavior

multiple keys of the same key with different values in the NVS flash.

Steps to reproduce

  1. make a copy of the nvs_rw_value example from the IDF repo to another location.
  2. open git bash in this location (or cmd, ......) and run the following commands.
  3. . (path_to_your_idf_directory)/export.sh
  4. idf.py build
  5. idf.py -p COM3 erase_flash
  6. idf.py -p COM3 flash
  7. idf.py -p COM3 monitor
  8. monitor the updated values with each reset and after a while close the monitor and run the following command.
  9. esptool.py -p COM3 read_flash 0x9000 0x5000 ./dump.bin
  10. open this file in any binary online viewer (https://h.markbuild.com/doc/binary-viewer.html) or run command xxd ./dump.bin

Code to reproduce this issue

#include <stdio.h>
#include "freertos/FreeRTOS.h"
#include "freertos/task.h"
#include "esp_system.h"
#include "nvs_flash.h"
#include "nvs.h"

void app_main(void)
{
    // Initialize NVS
    esp_err_t err = nvs_flash_init();
    if (err == ESP_ERR_NVS_NO_FREE_PAGES || err == ESP_ERR_NVS_NEW_VERSION_FOUND) {
        // NVS partition was truncated and needs to be erased
        // Retry nvs_flash_init
        ESP_ERROR_CHECK(nvs_flash_erase());
        err = nvs_flash_init();
    }
    ESP_ERROR_CHECK( err );

    // Open
    printf("\n");
    printf("Opening Non-Volatile Storage (NVS) handle... ");
    nvs_handle_t my_handle;
    err = nvs_open("storage", NVS_READWRITE, &my_handle);
    if (err != ESP_OK) {
        printf("Error (%s) opening NVS handle!\n", esp_err_to_name(err));
    } else {
        printf("Done\n");

        // Read
        printf("Reading restart counter from NVS ... ");
        int32_t restart_counter = 0; // value will default to 0, if not set yet in NVS
        err = nvs_get_i32(my_handle, "counter", &restart_counter);
        switch (err) {
            case ESP_OK:
                printf("Done\n");
                printf("Restart counter = %d\n", restart_counter);
                break;
            case ESP_ERR_NVS_NOT_FOUND:
                printf("The value is not initialized yet!\n");
                break;
            default :
                printf("Error (%s) reading!\n", esp_err_to_name(err));
        }

        // Write
        printf("Updating restart counter in NVS ... ");
        restart_counter++;
        err = nvs_set_i32(my_handle, "counter", restart_counter);
        printf((err != ESP_OK) ? "Failed!\n" : "Done\n");

        // Commit written value.
        // After setting any values, nvs_commit() must be called to ensure changes are written
        // to flash storage. Implementations may write to storage at other times,
        // but this is not guaranteed.
        printf("Committing updates in NVS ... ");
        err = nvs_commit(my_handle);
        printf((err != ESP_OK) ? "Failed!\n" : "Done\n");

        // Close
        nvs_close(my_handle);
    }

    printf("\n");

    // Restart module
    for (int i = 3; i >= 0; i--) {
        printf("Restarting in %d seconds...\n", i);
        vTaskDelay(1000 / portTICK_PERIOD_MS);
    }
    printf("Restarting now.\n");
    fflush(stdout);
    esp_restart();
}

Debug Logs

Opening Non-Volatile Storage (NVS) handle... Done
Reading restart counter from NVS ... Done
Restart counter = 279
Updating restart counter in NVS ... Done
Committing updates in NVS ... Done

Restarting in 3 seconds...
Restarting in 2 seconds...
Restarting in 1 seconds...
Restarting in 0 seconds...

Restarting now.

ets Jun  8 2016 00:22:57

rst:0xc (SW_CPU_RESET),boot:0x13 (SPI_FAST_FLASH_BOOT)
configsip: 0, SPIWP:0xee
clk_drv:0x00,q_drv:0x00,d_drv:0x00,cs0_drv:0x00,hd_drv:0x00,wp_drv:0x00
mode:DIO, clock div:2
load:0x3fff0030,len:6952
load:0x40078000,len:14292
load:0x40080400,len:3688
entry 0x40080678
I (27) boot: ESP-IDF v4.3 2nd stage bootloader
I (27) boot: compile time 02:47:05
I (27) boot: chip revision: 1
I (30) boot_comm: chip revision: 1, min. bootloader chip revision: 0
I (37) boot.esp32: SPI Speed      : 40MHz
I (41) boot.esp32: SPI Mode       : DIO
I (46) boot.esp32: SPI Flash Size : 2MB
I (50) boot: Enabling RNG early entropy source...
I (56) boot: Partition Table:
I (59) boot: ## Label            Usage          Type ST Offset   Length
I (67) boot:  0 nvs              WiFi data        01 02 00009000 00006000
I (74) boot:  1 phy_init         RF data          01 01 0000f000 00001000
I (82) boot:  2 factory          factory app      00 00 00010000 00100000
I (89) boot: End of partition table
I (93) boot_comm: chip revision: 1, min. application chip revision: 0
I (100) esp_image: segment 0: paddr=00010020 vaddr=3f400020 size=07cc8h ( 31944) map
I (121) esp_image: segment 1: paddr=00017cf0 vaddr=3ffb0000 size=02bc0h ( 11200) load
I (126) esp_image: segment 2: paddr=0001a8b8 vaddr=40080000 size=05760h ( 22368) load
I (138) esp_image: segment 3: paddr=00020020 vaddr=400d0020 size=18d44h (101700) map
I (176) esp_image: segment 4: paddr=00038d6c vaddr=40085760 size=05ecch ( 24268) load
I (187) esp_image: segment 5: paddr=0003ec40 vaddr=50000000 size=00010h (    16) load
I (194) boot: Loaded app from partition at offset 0x10000
I (194) boot: Disabling RNG early entropy source...
I (207) cpu_start: Pro cpu up.
I (207) cpu_start: Starting app cpu, entry point is 0x4008111c
I (194) cpu_start: App cpu up.
I (222) cpu_start: Pro cpu start user code
I (222) cpu_start: cpu freq: 160000000
I (222) cpu_start: Application information:
I (226) cpu_start: Project name:     nvs-rw-value
I (232) cpu_start: App version:      1
I (236) cpu_start: Compile time:     Jul  2 2021 02:44:44
I (242) cpu_start: ELF file SHA256:  ab026702d3eca396...
I (248) cpu_start: ESP-IDF:          v4.3
I (253) heap_init: Initializing. RAM available for dynamic allocation:
I (260) heap_init: At 3FFAE6E0 len 00001920 (6 KiB): DRAM
I (266) heap_init: At 3FFB3420 len 0002CBE0 (178 KiB): DRAM
I (272) heap_init: At 3FFE0440 len 00003AE0 (14 KiB): D/IRAM
I (279) heap_init: At 3FFE4350 len 0001BCB0 (111 KiB): D/IRAM
I (285) heap_init: At 4008B62C len 000149D4 (82 KiB): IRAM
I (292) spi_flash: detected chip: generic
I (296) spi_flash: flash io: dio
W (300) spi_flash: Detected size(4096k) larger than the size in the binary image header(2048k). Using the size in the binary image header.
I (314) cpu_start: Starting scheduler on PRO CPU.
I (0) cpu_start: Starting scheduler on APP CPU.

Opening Non-Volatile Storage (NVS) handle... Done

Reading restart counter from NVS ... Done
Restart counter = 280
Updating restart counter in NVS ... Done
Committing updates in NVS ... Done

Restarting in 3 seconds...
Restarting in 2 seconds...

Other items if possible

nvs_dump part of the nvs partition dump read

build_and_dump_files.zip

igrr commented 3 years ago

@petermagdy94 This is normal — NVS appends new key-value pairs instead of immediately overwriting the existing ones as a flash wear levelling measure. This allows erasing one 4kB flash sector only when it fills up, and not for each value update. NVS keeps track of the state of the key-value pair (not written yet / written and valid / invalidated) using a bitmap at the beginning of the flash sector. Appending the new entry and marking the previous entry as invalid can be done without doing a sector erase.

There is a description of how NVS works here, please take a look: https://docs.espressif.com/projects/esp-idf/en/latest/esp32/api-reference/storage/nvs_flash.html#internals.

petermagdy94 commented 3 years ago

Ok, but is it good for production ? because i also noticed in my project which is using other libraries that also write in nvs partition, each time i write and read the partition, keys are moved to other places and removed from its previous one.

igrr commented 3 years ago

What libraries do you mean by "other libraries that also write in nvs partition"?

There is a situation when it might look like the key-value pair is moved. If the page is full, and new key-value pair can not be appended to the same page, then the value will be written to a new page. Afterwards, the key-value pairs in the old page may get garbage-collected, and the old page will be erased. As a result, you will only see the key being present in the new page.

The approach is suitable for production use. In fact, the log-based approach is also used to ensure that NVS partition can always be recovered if the device is powered off while an operation is in progress, and no key-value pairs will be lost. Keep in mind that for production use we recommend enabling the hardware security features: flash encryption and secure boot. Starting from IDF 4.3, NVS encryption will be enabled automatically if flash encryption is enabled.

negativekelvin commented 3 years ago

If you are worried about access times and fragmentation then you may want to consider using multiple nvs partitions or some kind of ram cache

petermagdy94 commented 3 years ago

@igrr actually my project works on arduino platform but i found that the problem is from the lower layer of idf platform, so i mentioned the issue here. the library that writes in nvs is wifimanager [https://github.com/tzapu/WiFiManager.git] or the WiFi library itself. BTW from the documentation and your help, i think there is no chance to corrupt a key-value pair in run time after a bin file is flashed to nvs partition by the factory for production.

petermagdy94 commented 3 years ago

@negativekelvin yes you are right, that's another solution but i wanted to make sure that one nvs partition will not make a problem.

igrr commented 3 years ago

If you have some NVS data flashed in factory, I would recommend using two NVS partitions:

This will guarantee that no matter what happens with the main NVS partition, the factory settings will not be modified. This setup also works well with the factory reset feature, if you are using it.

For an example of how to use multiple NVS partitions, please check the related section in esp-jumpstart: https://docs.espressif.com/projects/esp-jumpstart/en/latest/manufacturing.html#multiple-nvs-partitions.

petermagdy94 commented 3 years ago

thank you all for your help, i used second nvs partition and it works well.