lfs.c:998:error: Corrupted dir pair at {0x1, 0x0}

hbZhao9 commented 4 years ago

Dear,

I ported littlefs to an STM32F4. It runs fine for hours, but mounting the file system randomly fails during startup. To reproduce this, I wrote a small loop that writes data into the file system, incrementing the file name by 1 each time. I get the errors below (it failed after creating 300 files successfully):

SPI_FLASH_READ 0x1004000 4096
SPI_FLASH_READ 0xE54000 4096
NorFlash: Erase at 0xE53000
NorFlash: write date at 0xE53000. size 256,
NorFlash: write date at 0xE53100. size 256, chip id 0
NorFlash: write date at 0xE53200. size 256, chip id 0
NorFlash: write date at 0xE53300. size 256, chip id 0
NorFlash: write date at 0xE53400. size 256, chip id 0
NorFlash: write date at 0xE53500. size 256, chip id 0
NorFlash: write date at 0xE53600. size 256, chip id 0
NorFlash: write date at 0xE53700. size 256, chip id 0
NorFlash: write date at 0xE53800. size 256, chip id 0
NorFlash: write date at 0xE53900. size 256, chip id 0
NorFlash: write date at 0xE53A00. size 256, chip id 0
NorFlash: write date at 0xE53B00. size 256, chip id 0
NorFlash: write date at 0xE53C00. size 256, chip id 0
NorFlash: write date at 0xE53D00. size 256, chip id 0
NorFlash: write date at 0xE53E00. size 256, chip id 0
NorFlash: write date at 0xE53F00. size 256, chip id 0
SPI_FLASH_READ 0xE53000 4096
/tmp/test301--Done-----
SPI_FLASH_READ 0x0 4096
SPI_FLASH_READ 0x1000 4096
SPI_FLASH_READ 0x0 4096
../Src/Models/FileSystem/lfs.c:998:error: Corrupted dir pair at {0x1, 0x0}
Fail to open file /tmp/test302 for writing, ret: -84
/tmp/test302--Done-----
SPI_FLASH_READ 0x1000 4096
SPI_FLASH_READ 0x0 4096
../Src/Models/FileSystem/lfs.c:998:error: Corrupted dir pair at {0x1, 0x0}
Fail to open file /tmp/test303 for writing, ret: -84

My parameters are as follows:

static uint8_t fs_read_buffer[4096] = {0};
static uint8_t fs_prog_buffer[4096] = {0};
static uint8_t fs_lookahead_buffer[4096] = {0};

static const struct lfs_config fs_cfg = {
    // block device operations
    .read  = &block_device_read,
    .prog  = &block_device_prog,
    .erase = &block_device_erase,
    .sync  = &block_device_sync,

    // block device geometry
    .read_size      = 4096,
    .prog_size      = 4096,
    .block_size     = 4096,
    .block_count    = 32768,
    .block_cycles   = 500,
    .cache_size     = 4096,
    .lookahead_size = 4096,

    // statically allocated buffers
    .read_buffer      = fs_read_buffer,
    .prog_buffer      = fs_prog_buffer,
    .lookahead_buffer = fs_lookahead_buffer,
};

I use the latest version of lfs. The problem can be reproduced. Please tell me how I can fix it. Thanks.

regards

blomnik commented 4 years ago

Hello, @hbZhao9. Is it possible that another part of your code (maybe another task in the OS) uses LFS at the same time? As far as I know, when you allocate the *_buffer arrays statically, only one file can be open at a time. Maybe this is the result of unintended memory/buffer corruption?

One more thing: by my calculation, your flash is 128 MB. Is that correct?

hbZhao9 commented 4 years ago

hi, @blomnik , thanks for your kind reply.

There are 4 external SPI norflash in system, each flash is 32 MB. So the total flash size is 128 MB. I am aware of the limitation of using static memory. As this is an embedded system, I allow only one file to be open at a time to reduce dynamically allocated memory.

For the current test, I only run a loop that keeps writing to files. Below is the initialization part for the file operations:
//////////////////////////////////////////////////////////////////////////////////////
static lfs_t lfs = {0};
static uint8_t fs_file_buffer[4096] = {0};
static const struct lfs_file_config file_cfg = {
    .buffer = fs_file_buffer,
};

lfs_mount(&lfs, &fs_cfg);
//////////////////////////////////////////////////////////////////////////////////////

The test part is:
//////////////////////////////////////////////////////////////////////////////////////
uint32_t data_buffer[1000];
char file_name[20];
lfs_file_t file;
int32_t ret;
lfs_mkdir(&lfs, "/tmp");

for (iloop = 0; iloop < 600; iloop++) {
    // one new file per iteration: /tmp/test0, /tmp/test1, ...
    snprintf(file_name, sizeof(file_name), "/tmp/test%d", iloop);
    ret = lfs_file_opencfg(&lfs, &file, file_name, LFS_O_WRONLY | LFS_O_CREAT, &file_cfg);
    if (ret >= 0) {
        ret = lfs_file_write(&lfs, &file, (uint8_t *)data_buffer, sizeof(data_buffer));
        if (ret < 0) {
            printf("Fail to write to file %s, ret: %ld\n", file_name, ret);
        }

        lfs_file_close(&lfs, &file);
    }
    else
    {
        printf("Fail to open file %s for writing, ret: %ld\n", file_name, ret);
    }

    printf("%s--Done-----\n", file_name);
    HAL_Delay(500);
}
//////////////////////////////////////////////////////////////////////////////////////

The error happens randomly. Last time it happened after 300 files were created. I tested again and it happened after 142 files were created.

blomnik commented 4 years ago

Hmm... looks fine. I have some doubts about

There are 4 external SPI norflash in system, each flash is 32 MB.

Maybe the implementation that "merges" these four chips into one "large flat" storage is not compatible with LFS? Or does it not respect the address offset? Or something else in this direction? Moreover, maybe you are trying to use a common bus for all four chips simultaneously? IMHO, the problem lies deeper: in the implementation of cfg.read, cfg.prog, cfg.erase, cfg.sync. But I can be wrong, of course :)
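To make the "merging" concern concrete, here is a minimal sketch of how a flat byte address could be translated into a chip index and an offset inside that chip. It assumes four chips of 32 MiB each, as described above; the names `map_flat_address`, `crosses_chip_boundary` and `struct chip_addr` are hypothetical and not taken from the poster's driver.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed geometry: four chips of 32 MiB each, as described in the thread. */
#define CHIP_SIZE   (32u * 1024u * 1024u)
#define CHIP_COUNT  4u

/* Hypothetical result type: which chip to select and where inside it. */
struct chip_addr {
    uint32_t chip;    /* chip index, 0..CHIP_COUNT-1 */
    uint32_t offset;  /* byte offset inside that chip */
};

/* Translate a flat byte address into (chip, offset).
 * Returns 0 on success, -1 if the address is outside the merged range. */
static int map_flat_address(uint32_t flat, struct chip_addr *out)
{
    assert(out != NULL);
    if (flat >= CHIP_SIZE * CHIP_COUNT) {
        return -1;
    }
    out->chip = flat / CHIP_SIZE;
    out->offset = flat % CHIP_SIZE;
    return 0;
}

/* A single transfer must not straddle two chips.
 * Returns 1 if [flat, flat + size) touches more than one chip. */
static int crosses_chip_boundary(uint32_t flat, uint32_t size)
{
    return size != 0 && (flat / CHIP_SIZE) != ((flat + size - 1u) / CHIP_SIZE);
}
```

With this mapping, flat address 0x2001000 lands on chip 1 at offset 0x1000. Because 32 MiB is a multiple of the 4096-byte block size, a block-aligned transfer of one block never crosses a chip boundary, so a check like `crosses_chip_boundary` should only fire if the driver is called with unexpected arguments.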

hbZhao9 commented 4 years ago

@blomnik Yes, I implemented the "merging" functions for these four chips as you described.

To simplify the test case, I limited all operations to one chip. The new settings are:

.read_size      = 4096,
.prog_size      = 4096,
.block_size     = 4096,
.block_count    = 8192,
.block_cycles   = 500,
.cache_size     = 4096,
.lookahead_size = 1024,

However, the problem still occurs after hundreds of files have been created successfully.

When LFS_YES_TRACE is enabled, the trace output is as below:
/////////////////////////////////////////////////////////////////////////////////////////////////////////
../Src/Models/FileSystem/lfs.c:2389:trace: lfs_file_opencfg(200194DC, 2004826C, "/tmp/test5", 102, 0803C2FC {.buffer=200184DC, .attrs=00000000, .attr_count=0})
../Src/Models/FileSystem/lfs.c:2526:trace: lfs_file_opencfg -> 0
../Src/Models/FileSystem/lfs.c:2874:trace: lfs_file_write(200194DC, 2004826C, 200482C0, 4000)
../Src/Models/FileSystem/lfs.c:2998:trace: lfs_file_write -> 4000
../Src/Models/FileSystem/lfs.c:2548:trace: lfs_file_close(200194DC, 2004826C)
../Src/Models/FileSystem/lfs.c:2735:trace: lfs_file_sync(200194DC, 2004826C)
../Src/Models/FileSystem/lfs.c:2787:trace: lfs_file_sync -> 0
../Src/Models/FileSystem/lfs.c:2567:trace: lfs_file_close -> 0
/tmp/test5--Done-----

...

../Src/Models/FileSystem/lfs.c:2389:trace: lfs_file_opencfg(200194DC, 2004826C, "/tmp/test217", 102, 0803C2FC {.buffer=200184DC, .attrs=00000000, .attr_count=0})
../Src/Models/FileSystem/lfs.c:998:error: Corrupted dir pair at {0x1, 0x0}
../Src/Models/FileSystem/lfs.c:2548:trace: lfs_file_close(200194DC, 2004826C)
../Src/Models/FileSystem/lfs.c:2735:trace: lfs_file_sync(200194DC, 2004826C)
../Src/Models/FileSystem/lfs.c:2740:trace: lfs_file_sync -> 0
../Src/Models/FileSystem/lfs.c:2567:trace: lfs_file_close -> 0
../Src/Models/FileSystem/lfs.c:2533:trace: lfs_file_opencfg -> -84
Fail to open file /tmp/test217 for writing, ret: -84
/tmp/test217--Done-----
/////////////////////////////////////////////////////////////////////////////////////////////////////////

When the error happens, the call stack is:

lfs_dir_fetchmatch() at lfs.c:997 0x801c87e
lfs_dir_find() at lfs.c:1,167 0x801cc84
lfs_file_opencfg() at lfs.c:2,409 0x801e918

My concern is: if there were a problem in the implementation of read/write/erase/sync, why does the error only appear after many file operations have succeeded?

blomnik commented 4 years ago

Hello. It will be helpful if you compile and run code with LFS_YES_TRACE macro. ;-)

hbZhao9 commented 4 years ago

Yes, it helps to locate where the issue occurs. Unfortunately, it does not reveal the cause of the problem.

hbZhao9 commented 4 years ago

To verify the low-level hardware interface functions, I ported FATFS in place of littlefs using the same functions (block_device_read / block_device_prog / block_device_erase / block_device_sync). The test application can create and write 500 files successfully.

I still do not understand what causes this issue. Could anyone explain it? Or could you tell me what to do when it happens, other than reformatting the file system? Thanks.
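A file-level test with a different filesystem may touch only part of the device, so it does not necessarily prove that the driver works over the whole address range. One filesystem-independent check is to tag every block with its own index and read all tags back, which reveals blocks that alias each other. The sketch below is destructive (it erases the device) and uses a hypothetical, simplified driver interface (`struct blockdev`), not the littlefs callbacks. The RAM-backed `fake_flash` exists only to demonstrate the check; its `mask` field models a driver that drops high address bits. On real flash the tag write would have to respect the program size of the chip.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified driver interface (NOT the littlefs callbacks). */
struct blockdev {
    int (*read)(void *ctx, uint32_t block, void *buf, uint32_t size);
    int (*prog)(void *ctx, uint32_t block, const void *buf, uint32_t size);
    int (*erase)(void *ctx, uint32_t block);
    void *ctx;
};

/* Destructive check: tag every block with a value derived from its index,
 * then read all tags back. Returns -1 if every block kept its own tag,
 * the index of the first mismatching block otherwise, or -2 on a driver error. */
static int32_t find_aliased_block(const struct blockdev *dev, uint32_t block_count)
{
    assert(dev != NULL);
    for (uint32_t b = 0; b < block_count; b++) {
        uint32_t tag = b ^ 0xA5A5A5A5u;
        if (dev->erase(dev->ctx, b) != 0 ||
            dev->prog(dev->ctx, b, &tag, sizeof(tag)) != 0) {
            return -2;
        }
    }
    for (uint32_t b = 0; b < block_count; b++) {
        uint32_t tag = 0;
        if (dev->read(dev->ctx, b, &tag, sizeof(tag)) != 0) {
            return -2;
        }
        if (tag != (b ^ 0xA5A5A5A5u)) {
            return (int32_t)b;
        }
    }
    return -1;
}

/* RAM-backed fake device, used only to demonstrate the check. */
#define FAKE_BLOCKS      16u
#define FAKE_BLOCK_SIZE  16u

struct fake_flash {
    uint8_t mem[FAKE_BLOCKS][FAKE_BLOCK_SIZE];
    uint32_t mask;  /* block index bits the "driver" actually honours */
};

static int fake_read(void *ctx, uint32_t block, void *buf, uint32_t size)
{
    struct fake_flash *f = ctx;
    memcpy(buf, f->mem[block & f->mask], size);
    return 0;
}

static int fake_prog(void *ctx, uint32_t block, const void *buf, uint32_t size)
{
    struct fake_flash *f = ctx;
    memcpy(f->mem[block & f->mask], buf, size);
    return 0;
}

static int fake_erase(void *ctx, uint32_t block)
{
    struct fake_flash *f = ctx;
    memset(f->mem[block & f->mask], 0xFF, FAKE_BLOCK_SIZE);
    return 0;
}
```

With `mask = 0xF` the fake honours all 16 block indices and the check returns -1. With `mask = 0x7`, blocks 8 to 15 overwrite blocks 0 to 7, and the check reports block 0 as the first one that lost its tag.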

frveee commented 3 years ago

Hi, have you solved this problem? I have run into the same error.

shilpa-1992 commented 2 years ago

Hi, what was the root cause of this issue? How was it fixed?

Filip83 commented 2 years ago

Hello, I have had a similar problem. The root cause of this problem, in my case, was an error in the FLASH driver. The FLASH driver used 3B addressing mode for my FLASH chip (128MB), which was insufficient. So, switching the FLASH driver to 4B addressing mode helped resolve the issue.
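The arithmetic behind this can be sketched as follows. A 3-byte address covers 2^24 bytes = 16 MiB, so any address at or above 0x1000000 cannot be expressed. The sketch assumes that a driver in 3-byte mode simply sends the low 24 bits, so higher addresses wrap around to the start of the chip, which is where littlefs keeps its superblock pair (blocks 0 and 1). The helper names are hypothetical, and the sample address 0x1004000 is taken from the first log in this thread only as an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Number of bytes reachable with an address of addr_bytes bytes (1..4). */
static uint64_t addressable_bytes(unsigned addr_bytes)
{
    assert(addr_bytes >= 1u && addr_bytes <= 4u);
    return (uint64_t)1 << (8u * addr_bytes);
}

/* Address the chip would see if only the low 3 bytes are transmitted
 * (assumption: the missing high byte is silently dropped). */
static uint32_t truncate_to_3_bytes(uint32_t addr)
{
    return addr & 0x00FFFFFFu;
}
```

Under that assumption, an access intended for 0x1004000 would hit 0x4000 instead, while an address below 16 MiB such as 0xE54000 is unaffected. This would also fit a failure that only shows up after many files: nothing goes wrong until the allocator hands out a block above the 16 MiB boundary.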

SaiPraveen22 commented 2 years ago

Hello, I am using an STM32L5 microcontroller with 16 MB of flash in QuadSPI mode, and with lfs I am getting the same error. I think it is due to an incorrect device configuration. Can anyone tell me the correct configuration for my memory? The memory is a Winbond W25Q128FV (Waveshare W25Q DataFlash Board). My configuration is below.

// block device configuration
.read_size      = 64,
.prog_size      = 64,
.block_size     = 512,
.block_count    = 256,
.cache_size     = 64,
.lookahead_size = 64,
.block_cycles   = 500,

Can anyone tell me how to set this configuration? Thank you in advance.

blomnik commented 2 years ago

Your SPI flash has a 4 KB erase sector size, not 512 bytes.
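As a rough sketch of how the geometry could be derived, the code below assumes the usual W25Q128FV figures (16 MiB total, 4 KiB erase sectors, 256-byte program pages) and checks the divisibility rules documented in the comments of `lfs.h`. `struct geometry` is a hypothetical stand-in whose field names mirror the `lfs_config` geometry fields; the chosen read, cache and lookahead sizes are just one valid combination, not a recommendation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed W25Q128FV figures; check them against the datasheet of your part. */
#define FLASH_BYTES   (16u * 1024u * 1024u)
#define SECTOR_BYTES  4096u  /* smallest erase unit */
#define PAGE_BYTES    256u   /* largest single program unit */

/* Hypothetical stand-in for the geometry fields of struct lfs_config. */
struct geometry {
    uint32_t read_size, prog_size, block_size, block_count;
    uint32_t cache_size, lookahead_size;
};

/* One possible geometry for this chip. */
static struct geometry w25q128_geometry(void)
{
    struct geometry g;
    g.read_size = PAGE_BYTES;
    g.prog_size = PAGE_BYTES;
    g.block_size = SECTOR_BYTES;
    g.block_count = FLASH_BYTES / SECTOR_BYTES;
    g.cache_size = PAGE_BYTES;
    g.lookahead_size = 64u;
    return g;
}

/* Divisibility rules from the lfs.h comments. Returns 1 if all hold. */
static int geometry_is_consistent(const struct geometry *g)
{
    assert(g != NULL);
    return g->block_size % g->read_size == 0 &&
           g->block_size % g->prog_size == 0 &&
           g->cache_size % g->read_size == 0 &&
           g->cache_size % g->prog_size == 0 &&
           g->block_size % g->cache_size == 0 &&
           g->lookahead_size % 8u == 0;
}

/* Returns 1 if block_size is a whole number of erase sectors. */
static int matches_erase_size(const struct geometry *g)
{
    return g->block_size % SECTOR_BYTES == 0;
}

/* Number of bytes the filesystem would span. */
static uint32_t covered_bytes(const struct geometry *g)
{
    return g->block_size * g->block_count;
}
```

The configuration posted above (64, 64, 512, 256, 64, 64) passes the divisibility rules, but its 512-byte block is smaller than the erase sector and it spans only 128 KiB of the chip. The sketch yields `block_size = 4096` and `block_count = 4096`, which covers the full 16 MiB.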

SaiPraveen22 commented 2 years ago

Does the error depend on the block device configuration? I tried changing the configuration, but the error persists. I think the memory is not mounting, because I get the same error even when the memory is disconnected. The error is given below:

littlefs/lfs.c:1228:error: Corrupted dir pair at {0x0, 0x1}
littlefs/lfs.c:1886:debug: Bad block at 0x0
littlefs/lfs.c:1891:warn: Superblock 0x0 has become unwritable.

Has anyone faced this issue? Please help me resolve it.

blomnik commented 2 years ago

Does the error depends on block device configuration?

Yes, it does. littlefs erases whole blocks when it updates the filesystem, so block_size has to correspond to the physical erase block of the flash (the erase size or a multiple of it).

chandrasekharmorisetti commented 1 year ago

Corrupted dir pair at {0x1, 0x0} in little fs

We experienced this error, and it was the result of erasing the flash during initialisation. If you hit this error, please check whether the FS region is erased anywhere in your code.

If lfs mount is attempted after an erase, it fails because the previously written file system data has been erased from the flash. This forces a format followed by a remount. So avoid erasing the chip, to make sure subsequent mount operations succeed.
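The "format followed by remount" sequence can be sketched as below. The littlefs README uses the same idea with `lfs_mount()` and `lfs_format()`; to keep this example self-contained, the two operations are passed in as function pointers, and the names `mount_or_format`, `fs_op` and `fake_fs` are hypothetical. Formatting destroys whatever is on the device, so this fallback is only appropriate where a blank or erased flash is expected (for example on first boot), not as a recovery path for a filesystem that holds data.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical operation type. In a real port, the two operations would be
 * thin wrappers around lfs_mount() and lfs_format(). 0 means success. */
typedef int (*fs_op)(void *ctx);

enum mount_result {
    MOUNTED = 0,               /* existing filesystem mounted */
    MOUNTED_AFTER_FORMAT = 1,  /* device was formatted, then mounted */
    MOUNT_FAILED = -1          /* format or second mount failed */
};

/* Try to mount; if that fails, format once and try again. */
static enum mount_result mount_or_format(fs_op mount, fs_op format, void *ctx)
{
    assert(mount != NULL && format != NULL);
    if (mount(ctx) == 0) {
        return MOUNTED;
    }
    if (format(ctx) != 0) {
        return MOUNT_FAILED;
    }
    return mount(ctx) == 0 ? MOUNTED_AFTER_FORMAT : MOUNT_FAILED;
}

/* Minimal fake used only to demonstrate the control flow. */
struct fake_fs {
    int formatted;  /* 0 models a freshly erased device */
};

static int fake_mount(void *ctx)
{
    struct fake_fs *fs = ctx;
    return fs->formatted ? 0 : -84;  /* -84 is the code seen in the logs above */
}

static int fake_format(void *ctx)
{
    struct fake_fs *fs = ctx;
    fs->formatted = 1;
    return 0;
}
```

On a blank fake device the first call returns `MOUNTED_AFTER_FORMAT` and a second call returns `MOUNTED`. If the mount keeps failing even right after a successful format, the cause is more likely in the block device driver or its configuration than in the stored data.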

acassis commented 1 year ago

I'm facing same error on NuttX RTOS with LittleFS using W25Q512JV and STM32F777:

nsh> mount -t littlefs /dev/mtdblock0 /mnt
lfs_dir_fetchmatch: Corrupted dir pair at {0x1, 0x0}
nx_mount: ERROR: Bind method failed: -14
nsh: mount: mount failed: 14
nsh>

@chandrasekharmorisetti flash erasing didn't help:

nsh> flash_eraseall /dev/mtdblock0

nsh> mount -t littlefs /dev/mtdblock0 /mnt
lfs_dir_fetchmatch: Corrupted dir pair at {0x0, 0x1}
nx_mount: ERROR: Bind method failed: -14
nsh: mount: mount failed: 14
nsh>

It is strange because using FAT or SmartFS everything works fine!

chandu191 commented 1 year ago

@acassis You should not run "flash_eraseall /dev/mtdblock0". Doing so erases the blocks, so the mount fails. Can you try without the "flash_eraseall /dev/mtdblock0" command and issue only the mount command?

littlefs-project / littlefs

lfs.c:998:error: Corrupted dir pair at {0x1, 0x0} #461