littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers

can't save file larger than block size with w25q128jv nor flash #880

Open · metroluke opened this issue 11 months ago

metroluke commented 11 months ago

Hey guys!

I have a file of 4104 bytes that I can't save to my W25Q128JV NOR flash. There is no error code. My block size is configured as 4096. I ran a test with a 4092-byte file and it worked. I also configured my block size as 8192 and it started to work with my 4104-byte file. I can't keep the block size at 8192 because my memory doesn't support an 8 KB block erase. Is my file size limit supposed to be my block size?

cfg.read = w25qxx_block_device_read;
cfg.prog = w25qxx_block_device_prog;
cfg.erase = w25qxx_block_device_erase;
cfg.sync = w25qxx_block_device_sync;

cfg.lock = w25qxx_lock;
cfg.unlock = w25qxx_unlock;

cfg.read_buffer = LFS_read_buf;
cfg.prog_buffer = LFS_prog_buf;
cfg.lookahead_buffer = LFS_lookahead_buf;

cfg.read_size = 256;
cfg.prog_size = 256;
cfg.lookahead_size = 1024;

cfg.block_size = 4096;
cfg.block_count = 4096;

cfg.block_cycles = 500;
cfg.cache_size = 256;

file_cfg.buffer = LFS_cache_buf;
file_cfg.attrs = NULL;
file_cfg.attr_count = 0;
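A minimal sketch of the write path with this config (sketch only; "data.bin", save_payload and LFS_DATA are placeholder names, error handling trimmed):

#include "lfs.h"

lfs_t lfs;
lfs_file_t file;

int save_payload(void) {
    int err = lfs_mount(&lfs, &cfg);
    if (err) {
        return err;
    }

    err = lfs_file_opencfg(&lfs, &file, "data.bin",
            LFS_O_WRONLY | LFS_O_CREAT | LFS_O_TRUNC, &file_cfg);
    if (err) {
        return err;
    }

    // 4104 bytes, i.e. just over one 4096-byte block
    lfs_ssize_t written = lfs_file_write(&lfs, &file, LFS_DATA, 4104);

    err = lfs_file_close(&lfs, &file);
    lfs_unmount(&lfs);
    return (written < 0) ? (int)written : err;
}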

Thanks!!

geky commented 11 months ago

Hi @metroluke, thanks for reporting an issue.

Usually this indicates an issue in the block device layer. Maybe the block_count is configured incorrectly or the address calculation has a bug.

I would run tests like these to see if the underlying block device is working as expected (I realize these aren't super user-consumable atm): https://github.com/littlefs-project/littlefs/blob/master/tests/test_bd.toml
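If it helps, a rough on-target sanity check in the spirit of those tests might look like this (sketch only, not littlefs API; it just drives the block device callbacks from your cfg directly, is destructive to the block it touches, and assumes the 256-byte read_size/prog_size above):

#include <stdint.h>
#include "lfs.h"

static int bd_sanity_check(const struct lfs_config *cfg, lfs_block_t block) {
    static uint8_t wbuf[256], rbuf[256];   // sized for the 256-byte read/prog_size above

    // erase, then program a known pattern across the whole block
    int err = cfg->erase(cfg, block);
    if (err) {
        return err;
    }

    for (lfs_off_t off = 0; off < cfg->block_size; off += cfg->prog_size) {
        for (lfs_size_t i = 0; i < cfg->prog_size; i++) {
            wbuf[i] = (uint8_t)(block + off + i);   // arbitrary but reproducible pattern
        }
        err = cfg->prog(cfg, block, off, wbuf, cfg->prog_size);
        if (err) {
            return err;
        }
    }

    // read everything back and compare
    for (lfs_off_t off = 0; off < cfg->block_size; off += cfg->read_size) {
        err = cfg->read(cfg, block, off, rbuf, cfg->read_size);
        if (err) {
            return err;
        }
        for (lfs_size_t i = 0; i < cfg->read_size; i++) {
            if (rbuf[i] != (uint8_t)(block + off + i)) {
                return -1;   // mismatch: addressing or block device bug
            }
        }
    }
    return 0;
}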

I can't maintain block size as 8192 because my memory doesn't support a 8k block erase.

You can always implement a larger erase on top of a smaller erase by calling the erase function multiple times, erasing first the first 4 KB sector and then the second.
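Something like this, as a sketch (w25qxx_sector_erase here is a hypothetical helper that erases the 4 KiB sector containing the given byte address):

#include <stdint.h>
#include "lfs.h"

int w25qxx_sector_erase(uint32_t addr);   // hypothetical: erases the 4 KiB sector at addr

int w25qxx_block_device_erase(const struct lfs_config *c, lfs_block_t block) {
    // with block_size = 8192 but a chip that only erases 4096 bytes at a time,
    // erase both physical sectors that make up one littlefs block
    uint32_t addr = block * c->block_size;

    for (uint32_t off = 0; off < c->block_size; off += 4096) {
        int err = w25qxx_sector_erase(addr + off);
        if (err) {
            return LFS_ERR_IO;
        }
    }
    return 0;
}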

But I wouldn't suggest this route. This bug sounds like it will show up again even with a larger block size after more filesystem operations. Eventually wear-leveling will try to use other blocks for example.

metroluke commented 11 months ago

Hey @geky, thanks for your reply.

Usually this indicates an issue in the block device layer. Maybe the block_count is configured incorrectly or the address calculation has a bug.

My block_count is correct for sure, but about the address calculation I have a question: do I need to handle cases where I need to split a prog/read larger than my prog/read_size in my bd layer? I simply calculate the address as (block * block_size) + offset and trust that littlefs will ensure everything matches. I did this because I read in another issue here that this is OK if your cache_size == prog/read_size.

I would run tests like these to see if the underlying block device is working as expected (I realize these aren't super user-consumable atm): https://github.com/littlefs-project/littlefs/blob/master/tests/test_bd.toml

I'm running in an STM32 + RTOS environment and I don't think I can execute exactly this, but I will try to build something like it.

Thank you!!

geky commented 11 months ago

I have a question: do I need to handle cases where I need to split a prog/read larger than my prog/read_size in my bd layer?

If I understand correctly this may be the issue.

littlefs may call read/prog with sizes that are multiples of read_size/prog_size. The calls respect read_size/prog_size alignment and never cross block_size boundaries, but they may be larger than a single read_size/prog_size.

For example, say prog_size = 32. littlefs may do any of these (illustrative):

prog(block=0, off=0, size=32)
prog(block=0, off=32, size=32)
prog(block=0, off=64, size=128)

But it will never do any of these, because of invalid alignment:

prog(block=0, off=16, size=32)
prog(block=0, off=0, size=48)

If your block device has a limit on how much data can be read/progged in a single operation, you may need a loop to read/prog everything.
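For example, if the limit on your side is the W25Q's 256-byte page program, the prog callback could loop over pages, roughly like this (sketch; w25qxx_page_program is a hypothetical helper, and it assumes prog_size = 256 as in your config so every chunk lands on a page boundary):

#include <stdint.h>
#include "lfs.h"

int w25qxx_page_program(uint32_t addr, const uint8_t *data, uint32_t len);   // hypothetical helper

int w25qxx_block_device_prog(const struct lfs_config *c, lfs_block_t block,
        lfs_off_t off, const void *buffer, lfs_size_t size) {
    uint32_t addr = block * c->block_size + off;
    const uint8_t *data = buffer;

    // littlefs may pass size > prog_size, so split into 256-byte page programs
    while (size > 0) {
        lfs_size_t chunk = size < 256 ? size : 256;
        if (w25qxx_page_program(addr, data, chunk) != 0) {
            return LFS_ERR_IO;
        }
        addr += chunk;
        data += chunk;
        size -= chunk;
    }
    return 0;
}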

I did this because I read in another issue here that this is OK if your cache_size == prog/read_size.

If you remember where you saw this let me know and I can edit/comment on it. At one point littlefs behaved a bit differently, but being able to read/prog multiples is a valuable optimization.

The main benefit is that block devices can take advantage of larger/more efficient read/prog operations without littlefs artificially limiting the read/prog granularity.

CSC-Sendance commented 8 months ago

Hi,

I have the same chip and am trying to write to it via SPI. My observations: the manufacturer says a block is 64 KB. This differs from your block configuration. I don't think read/progs should be an issue as long as you don't increase cache_size beyond the flash's maximum prog size (256 bytes); at least I haven't seen it exceed 256 bytes without that.

However, I believe your block size configuration is still possible if you do a Sector Erase (4 KB) instead of a Block Erase, or even a manual erase (a prog with a 0xFF-filled buffer), in your w25qxx_block_device_erase.

The 64 KB blocks are problematic for me anyway, because as soon as littlefs switches to a new block, it has to erase the new one before writing to it. This causes delays of 200+ ms for me when it issues the BLOCK_ERASE command. The block allocator doesn't seem to be the problem, as far as I have seen.

Accordingly, I tried implementing a block size of 512 by also implementing littlefs's user-defined device_erase via 2x manual 256-byte "progs" (each with the same 0xFF-filled buffer of 256 bytes). This results in MUCH more stable write times (min: 1 ms, max: 3 ms, avg: ~2 ms with 300-byte writes, instead of a max of 200 ms whenever the "block border" is reached), since each erase call takes so much less time, while probably (I didn't test it) being less efficient overall.

geky commented 8 months ago

Accordingly, I tried implementing a block size of 512 by also implementing littlefs's user-defined device_erase via 2x manual 256-byte "progs" (each with the same 0xFF-filled buffer of 256 bytes)

Any appearance of this working is a lie and the universe is being mean to you.

Generally flash works by large erase operations setting memory to all ones 0x?? -> 0xff and then smaller prog operations masking zeros 0xff & 0xcc -> 0xcc.

If you try to "erase" by programming 0xffs again, it just ends up a noop: 0xcc & 0xff -> 0xcc.

Sidenote: not all chips support masking writes. On some chips this can result in lower data retention. I think the w25q being NOR is fine, but you should check the datasheet if you rely on masking writes. littlefs does not rely on masking writes.

I suspect the reason this appears to work is 1. the flash happened to already be erased, either from the factory or from previous operations, and 2. not enough writes have happened for littlefs to reuse previous storage. If you continue to write to the filesystem data corruption will happen eventually.


If I'm mistaken and you're triggering an erase some other way, you shouldn't actually need to write 0xffs to disk. littlefs doesn't care about the actual contents after an erase, just that the contents won't change and are programmable. This helps integrate with RAM/FTLs/encryption/etc.

My observations: the manufacturer says a block is 64 KB. This differs from your block configuration.

I think this is just a terminology issue. In storage these terms are a mess, with "block/sector/cluster/page/cylinder" meaning different and sometimes conflicting things across technologies. My understanding is a "block" on NOR flash is a part of its internal structure, and it may be faster to erase a full NOR-block at a time if you need to erase 64KiB, but since you can erase 4KiB NOR-sectors at a time, there's no reason to use NOR-blocks as littlefs's block_size.

Do let me know if there's any downsides to erasing NOR-sectors vs NOR-blocks, as far as I'm aware there isn't any.

The term "block" is commonly used in filesystems to represent an arbitrary logical unit that may or may not have physical disk-related constraints. littlefs adopted this, though maybe it would have been better to name this erase_size. The names prog_size and read_size were chosen to try to avoid some of the confusion.

I don't think read/progs should be an issue as long as you don't increase cache_size beyond the flash's maximum prog size (256 bytes); at least I haven't seen it exceed 256 bytes without that.

littlefs will also bypass the cache if you do something like:

lfs_file_write(&lfs, &file, really_big_buffer, 4096);

So that's something else to watch out for.

CSC-Sendance commented 8 months ago

Interesting - thanks so much for those hints! And yes, the whole block/sector/etc. terminology is very confusing. And you are obviously correct: further testing resulted in errors, so apparently this "erasing" did not work.

Overall, this probably means that (in my case, where erasing is the apparent bottleneck to achieving constant write speeds without outliers) the smallest erase operation the chip actually supports will remain the hard bound on performance at a "block change". Kinda bad to see 45-50 ms for a sector erase (4 KB) or 220 ms for a block erase (64 KB) when I would need a 1-2 ms page erase ;).

geky commented 7 months ago

Kinda bad to see 45-50 ms for a sector erase (4 KB) or 220 ms for a block erase (64 KB) when I would need a 1-2 ms page erase ;).

Ah yeah, erases are usually the main bottleneck with flash.

There is some work going on to improve this. You always have to pay for erases eventually, but at least in theory you can do some number of erases early so they don't delay any time-sensitive write operations, assuming you have enough idle time. This is a bit trickier with littlefs's design than with a logging filesystem though.

CSC-Sendance commented 7 months ago

Sounds reasonable. Hope to see this feature soon :)

I tried quite a few caching strategies, but eventually the problem still appears when there is a cache refresh triggered here: https://github.com/littlefs-project/littlefs/blob/f53a0cc961a8acac85f868b431d2f3e58e447ba3/lfs.c#L2960. This sometimes causes a "hard" read during an ongoing erase operation, which is thus blocking.

Eventually, we kinda gave up and are currently thinking about / prototyping a switch to NAND rather than NOR flash, since this should improve write speeds by an order of magnitude. However, I also saw https://github.com/littlefs-project/littlefs/issues/935, so it may not solve the issue after all. More testing is required, I guess :)

geky commented 7 months ago

However, I also saw https://github.com/littlefs-project/littlefs/issues/935, so it may not solve the issue after all. More testing is required, I guess :)

Ah yeah, with the poor scalability on littlefs's part I'm not sure I can suggest NAND for performance reasons. Some users have success with metadata_max (described in https://github.com/littlefs-project/littlefs/issues/935), but it's still a gamble who wins, the NAND or littlefs.

This is also something that's being worked on. littlefs was written for NOR first and NAND's geometry is quite a bit more challenging.

Though I'd be interested to know what you find with NAND vs NOR.

CSC-Sendance commented 3 months ago

Hi.. so much later ;) I posted some updates with a few benchmarks in the mentioned issue. Short summary: for our more or less write-only use case, NAND is still much faster overall, due to much faster erases. However, there are still write-time outliers with a distinct pattern that also seems to depend on the file size (or overall memory occupation). I guess it's better to continue the discussion in the other thread, since it fits the NAND topic better :)