zephyrproject-rtos / zephyr

Primary Git Repository for the Zephyr Project. Zephyr is a new generation, scalable, optimized, secure RTOS for multiple hardware architectures.
https://docs.zephyrproject.org
Apache License 2.0
10.51k stars 6.43k forks source link

Flash memory partitions not being initialized correctly by settings_subsys_init() #26212

Open lsangild opened 4 years ago

lsangild commented 4 years ago

I believe this to be related to #19880

Describe the bug I am attempting to implement the CANopenNode library in Zephyr into my project, but is having issues creating the NVS needed for the EEPROM part. My device is the stm32f4_disco, and I have added the following to the device tree file:

&flash0 {
    /* FLASH memory layout in reference manual Table 5. */
    partitions {
        compatible = "fixed-partitions";
        #address-cells = <1>;
        #size-cells = <1>;

        /*
         * Used by CANopen for the Object Dictionary.
         * 32 kB at sectors 0 and 1 (each 16 kB large).
         * Two sectors must be used for NVS.
         */
        storage_partition: partition@800000 {
            label = "storage";
            reg = <0x800000 0x00008000>;
        };
    };
};

and is using a prj.conf with these relevant parameters (inspired by the CANopen examples):

# CAN Settings
CONFIG_CAN=y
CONFIG_CAN_MAX_FILTER=13

# Memory settings, used for CANopen
CONFIG_FLASH=y
CONFIG_FLASH_MAP=y
CONFIG_FLASH_PAGE_LAYOUT=y
CONFIG_NVS=y
CONFIG_SETTINGS=y
CONFIG_SETTINGS_NVS=y
CONFIG_MPU_ALLOW_FLASH_WRITE=y

# CANopen settings
CONFIG_CANOPEN=y
CONFIG_CANOPEN_SYNC_THREAD=y
CONFIG_CANOPEN_STORAGE=y

The project compiles but when running settings_subsys_init() I get error -EDOM. This seems to be because hw_flash_sector does not get updated with values during the call to rc = flash_area_get_sectors(DT_FLASH_AREA_STORAGE_ID, &sector_cnt, &hw_flash_sector);

int flash_area_get_sectors(int idx, u32_t *cnt, struct flash_sector *ret)
{
    struct layout_data data;

    return flash_area_layout(idx, cnt, ret, get_sectors_cb, &data);
}

Where an inaccessible layout_data does get populated with data matching my setup in the dts file. It seems as if the data stored in layout_data should instead be stored in hw_flash_sector (ret), as this is the one getting checked for sector size.

nvs_sector_size = CONFIG_SETTINGS_NVS_SECTOR_SIZE_MULT *
    hw_flash_sector.fs_size;

if (nvs_sector_size > UINT16_MAX) {
    return -EDOM;
}

Expected behavior settings_subsys_init() returns 0.

Impact Annoyance. I can just not check if the subsys initialization worked, but that seems rather risky...

Environment (please complete the following information):

Additional context When debugging this, I can see that the Object Dictionary provided is indeed correct, but is stored completely in SRAM. I am unsure if this is due to the initialization not completing. Additionally it seems that CONFIG_SETTINGS_NVS_SECTOR_SIZE_MULT = 1

lsangild commented 4 years ago

I have found that the Object Dictionary is indeed stored in the allocated slot at 0x8037b38, and I had missed the structure of it, which points to FLASH when something is stored in EEPROM and to SRAM when something is stored in RAM/ROM. Still unsure as to if the reaction of settings_subsys_init() is a bug or me doing something wrong.

nvlsianpu commented 4 years ago

Please try tests/subsys/storage/flash_map/ on your board.

nvlsianpu commented 4 years ago

Is the storage are consist only pages of the same size? NVS doesn't support non-regular flash layout.

nvlsianpu commented 4 years ago

https://github.com/zephyrproject-rtos/zephyr/issues/19895

lsangild commented 4 years ago

Testing tests/subsys/storage/flash_map/ Using the original stm32f4_disco board I get an infinite loop, due to no storage partition original.log When I specify a partition called "storage" I get an error about missing "image_1" my_board.log When renaming my partition to "image_1" I get: image_1.log which compiles.

&flash0 {
    /* FLASH memory layout in reference manual Table 5. */
    partitions {
        compatible = "fixed-partitions";
        #address-cells = <1>;
        #size-cells = <1>;

        /*
         * Used by CANopen for the Object Dictionary.
         * 32 kB at sectors 2 and 3 (each 16 kB large).
         * Two sectors must be used for NVS.
         */
        storage_partition: partition@8000 {
            label = "image_1";
            reg = <0x0008000 0x8000>;
        };
    };
};

Running on the board I get console Which fits with my checks of the memory, that the memcpycommands in the subsystem setup (checking of the NVS) does not write anything to my file system. It seems that the hardware bits are set and the FLASH is NOT write protected.

lsangild commented 4 years ago

I have tried placing my NVS at the end of the flash in sectors 10 and 11, but it seems as if the NVS can only handle 16 kB sectors, at least there is a check in settings_backend_init which sets

if (nvs_sector_size > UINT16_MAX) {
    return -EDOM;
}

And when I place them in sectors 0 and 1, the memcpy fails for both sectors, which makes nvs_startup fail as it finds no open sectors.

Memory Map STM32F40x flash map

lsangild commented 4 years ago

I see now that I forgot the argument to include the CONFIG_MPU_ALLOW_FLASH_WRITE=y. The test suit now succeeds. Testing the canopen sample in samples/subsys/canbus/canopen the program stalls at ret = settings_subsys_init(); (tested with printk before and after the call, only the first is printed into terminal).

nvlsianpu commented 4 years ago

missing: CONFIG_MPU_ALLOW_FLASH_WRITE=y

lsangild commented 4 years ago

missing: CONFIG_MPU_ALLOW_FLASH_WRITE=y

Heh, yeah, found that out just 10 minutes ago. Now the flash_map test works. See my last comment for my next attempt using the canopen sample.

nvlsianpu commented 4 years ago

I assuming that now you are using sectors 10 and 11 for the storage. Please ensure that this area was erased before you start the sample 1st time.

lsangild commented 4 years ago

I have inspected the memory and found it empty (all 0xFF). Also cleared it with the STM32CubeProgrammer to ensure it was done, but still gets

*** Booting Zephyr OS build zephyr-v2.2.0  ***
[00:00:00.000,000] <inf> can_driver: Init of CAN_2 done
[00:00:00.004,000] <err> app: failed to initialize settings subsystem (err = -37)

after programming (only once).

nvlsianpu commented 4 years ago

^^ forgot sectors are to large. Can yo try 16 KB sectors?

otherwise there is PR which aim to support big sectors: https://github.com/zephyrproject-rtos/zephyr/pull/23876

lsangild commented 4 years ago

Programming the canopen sample using sectors 2 and 3 as below

&flash0 {
    /* FLASH memory layout in reference manual Table 5. */
    partitions {
        compatible = "fixed-partitions";
        #address-cells = <1>;
        #size-cells = <1>;

        /*
         * Used by CANopen for the Object Dictionary.
         * 32 kB at sectors 2 and 3 (each 16 kB large).
         * Two sectors must be used for NVS.
         */
        storage_partition: partition@8000 {
            label = "storage";
            reg = <0x0008000 0x8000>;
        };
    };
};

The sample stalls in ret = settings_subsys_init(); which is the same place my program stops working. While debugging my program I can see that at least the memory blocks 1, 2, 3 and 4 are NOT empty already as my main loop starts. I have not declared any threads or done anything which should put data there.

lsangild commented 4 years ago

Seems that this fails because flash_page_foreach sets cb_data->status=-12 in flash_area_layout

nvlsianpu commented 4 years ago

Normally executable code is placed beginning from the start of flash. You need to relocated the code somehow to make place for the storage (i don't know whether this is possible in this infrastructure). An option (kind of tank for killing an ant) is to use mcuboot and appropriate partition arrangement for achieve that. I believe this STM32 has reset vector relocation support so mayby there is an simpler way for move executable in zephyr-rtosd environment (ask for stm help)

lsangild commented 4 years ago

So using mcuboot it should be possible to move the data stored in the start of flash, to a higher location, e.g. sector 5, clear sectors 1-4 and then have mcuboot start execution of the program from the new address?

nvlsianpu commented 4 years ago

should be possible to: use sector 0 and 1 for mcuboot. use sector 2 and 3 for the storage use rest of flash for two same size image partitions, I believe you have to leave sector 4 empty.

lsangild commented 4 years ago

Hmmmm, does seem like overkill for the job. As I have an external EEPROM available the solution (probably until nvs supports larger sectors), will be to sync the CANopen Object Dictionary with the values in the external EEPROM at startup.

Should I close this issue as a pull request is already available?

nvlsianpu commented 4 years ago

^^ puted question on the zephyr slack

nvlsianpu commented 4 years ago

https://zephyrproject.slack.com/archives/C18PLHN8H/p1592395936067300

lsangild commented 4 years ago

https://zephyrproject.slack.com/archives/C18PLHN8H/p1592395936067300

I am unable to sign in to this slack, as I am not affiliated with any of the listed companies. image

nvlsianpu commented 4 years ago

Didn't be aware that access to it is limited:

I had big problems with flash & vector table offsets for the stm32f412 my solution so far: create own(out-of-tree) board + (out-of-tree) shield for example: myTest_0x2000.overlay &flash0 { partitions { compatible = "fixed-partitions";

address-cells = <1>;

size-cells = <1>;

slot0_partition: partition@20000 { label = "zephyr-code"; reg = <0x08020000 DT_SIZE_K(384)>; }; }; }; Now link your board + shield and project config together. I give as example 0x20000 byte offset CMakeLists.txt set(SHIELD "myTest_0x2000") proj.conf CONFIG_FLASH_LOAD_OFFSET=0x20000 This setup generates the dts defines and more important the CONFIG_FLASH_LOAD_OFFSET=0x20000 makes the arm compiler placing the vector table at the 0x20000 + ?200?(I am not sure) offset (edited)

see also #18157 if you got troubles

karstenkoenig commented 4 years ago

Slack access isn't limited, there is an open invite linked on this page. https://docs.zephyrproject.org/latest/guides/getting-help.html

Actual invite link https://tinyurl.com/y5glwylp

Laczen commented 4 years ago

@lsangild, the flash properties of these devices make it difficult to support flash storage. There are basically 2 options:

a. Use sectors of 128 kB for storage: at least two sectors are needed to ensure that whatever happens no data is lost. So at least 256 kB is used for storing even the smallest amount of data. Reading will generally be slow when the flash gets used more, and whenever a data copy/compression is needed this will be very slow. Also some of the devices with this kind of flash have only limited large sectors. --> This is probably a bad idea.

b. Use the sectors of 16kB for storage: this should not have the disadvantages of a., but... these sectors are at the flash start and are taken by mcuboot. In most cases mcuboot is larger than 32 kB, so there is no space left for two sectors of storage. --> This is impossible.

The best solution for these devices would require an extra piece of software, a first stage loader that is placed in the very first sector (sector 0). This first stage loader would then jump to mcuboot located in sector 4. The first stage loader should be kept smaller than 16kB so that sectors 1, 2 and 3 can be used for storage.

lsangild commented 4 years ago

@Laczen @nvlsianpu Thank you for the solution examples.

This just gets worse when I tell you that my project is meant to also span the STM32H7, which has a FLASH layout consisting of only 128 kB sectors... We do have, fortunately, an external EERAM (RAM + EEPROM), which I can use for non volatile storage, I just have to direct the data there manually.

Laczen commented 4 years ago

@Laczen @nvlsianpu Thank you for the solution examples.

This just gets worse when I tell you that my project is meant to also span the STM32H7, which has a FLASH layout consisting of only 128 kB sectors... We do have, fortunately, an external EERAM (RAM + EEPROM), which I can use for non volatile storage, I just have to direct the data there manually.

From a quick scan of the STM32H7 datasheet I would conclude that the sectorsize is 8kB.

lsangild commented 4 years ago

Can you elaborate on where you find this? In the Reference Manual page 151 I find that:

The embedded Flash non-volatile memory is composed of:
• For STM32H742/743/753 devices: a 2-Mbyte main memory block, organized as two
banks of 1 Mbyte each. Each bank is in turn divided in eight 128-Kbyte sectors and
features Flash-word rows of 256 bits + 10 bits of ECC per word.
Laczen commented 4 years ago

Can you elaborate on where you find this? In the Reference Manual page 151 I find that:

The embedded Flash non-volatile memory is composed of:
• For STM32H742/743/753 devices: a 2-Mbyte main memory block, organized as two
banks of 1 Mbyte each. Each bank is in turn divided in eight 128-Kbyte sectors and
features Flash-word rows of 256 bits + 10 bits of ECC per word.

I was reading the STM32H7A3/B3 and B0 manual, for the STM32H742/743/753 devices the sector size indeed seems to be 128kB.

aurel32 commented 4 years ago

The best solution for these devices would require an extra piece of software, a first stage loader that is placed in the very first sector (sector 0). This first stage loader would then jump to mcuboot located in sector 4. The first stage loader should be kept smaller than 16kB so that sectors 1, 2 and 3 can be used for storage.

I got the same issue on STM32F7. Fortunately you can change the boot base address in FLASH_OPTCR1 flash register for boot pin = 0 with BOOT_ADD0 and for boot pin = 1 with BOOT_ADD1. Of course the device tree on the Zephyr side should be updated to match that value.

nvlsianpu commented 4 years ago

Summarized: issue now is that NVS doesn't support sector bigger than 16 KB. This is not a bug, but NVS limitation which can be improved by https://github.com/zephyrproject-rtos/zephyr/pull/23876.

Laczen commented 4 years ago

@nvlsianpu, removing this nvs limitation will result in a new issue: NVS is slow on sectors of 128kB (especially garbage collection).

Also on some devices there are only a limited amount of 128kB sectors and you need at least two leaving almost no place for the application.

NVS has been written to store many small items and not a few large ones, this does not match good with large sectors.

The solution that really solves the problem is proposed by @aurel32 or to use a small first stage loader allows mcuboot to reside at a different location and not taking up the small sectors at flash start. These small sectors are ideally suited for NVS use.

I would summarize the issue as: "MCUboot and/or zephyr are putting to strict requirements on flash layout". And yes, this is a bug.