littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers

Possibly redundant flush when seeking #996

Open · stefano-zanotti opened this issue 2 weeks ago

stefano-zanotti commented 2 weeks ago

When seeking (lfs_file_seek), LFS needs to flush out whatever is in its cache, and then refresh the cache with the block where the new file position is. However, if the file is in read mode and the new position is in the same block as the one currently in cache, it can just change the position variable without any disk operation. See here: https://github.com/littlefs-project/littlefs/blob/d01280e64934a09ba16cac60cf9d3a37e228bb66/lfs.c#L3697-L3720

The same could be done when writing too: if the new position is in the same block, we can avoid disk operations and only trigger them on a subsequent sync/write. However, LFS currently skips this optimization: it always flushes the file if the cache is dirty.
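
To make this concrete, here is a minimal sketch of the check I have in mind, with purely illustrative names (this is not the littlefs source, just a paraphrase of the idea):

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative stand-in for the per-file cache state; not littlefs internals. */
    struct file_state {
        uint32_t pos;        /* current file position */
        uint32_t cache_off;  /* file offset covered by the cached window */
        uint32_t cache_size; /* size of the cached window */
        bool     writing;    /* cache holds data not yet flushed to disk */
    };

    /* Returns true if seeking to new_pos needs no disk I/O at all. */
    static bool seek_hits_cache(const struct file_state *f, uint32_t new_pos) {
        bool in_window = new_pos >= f->cache_off
                && new_pos < f->cache_off + f->cache_size;
        /* Current behavior: only reuse the cache when the file isn't being
           written. Proposed behavior: reuse it whenever the window doesn't
           move, i.e. drop the !f->writing part of the condition. */
        return !f->writing && in_window;
    }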

I tried removing the check on LFS_F_WRITING, but my software misbehaved in different parts of the code, so I assume this sync is needed for some other reason.

Is it really necessary? Can this optimization be implemented in some way, or is it impossible for some reason due to the internals of LFS?

In my use case, this missed optimization has a big impact on performance, since LFS is called with a pattern like the following:

SEEK 0,    WRITE 2048
SEEK 0,    WRITE 2048
SEEK 2048, WRITE 2048
SEEK 2048, WRITE 2048
etc.

That is, each write just needs to append 1KB at the end of the file, but the calling software works with 2KB blocks, so it always rewrites a whole block (padded with 0s), unless it needs to switch to the next one. This causes LFS to flush to disk even when it was not asked to do so by a sync/close call:

SEEK 0,    flush,  WRITE 2048
SEEK 0,    flush*, WRITE 2048
SEEK 2048, flush,  WRITE 2048
SEEK 2048, flush*, WRITE 2048

Instead, the flushes marked with * could be avoided: the data would still be in cache, and it would be overwritten by the following write. Here I'm assuming that LFS's cache size is 2048; other flushes could be avoided too if the cache were bigger.
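
For reference, the pattern in terms of the public API looks roughly like this (a mounted lfs_t is assumed, the cache size is 2048 as above, and error handling is omitted; the comments mark where the flushes currently happen):

    #include <stdint.h>
    #include "lfs.h"

    void padded_append_pattern(lfs_t *lfs) {
        static uint8_t block[2048];                  /* stands in for 1KB payload + 1KB zero padding */
        lfs_file_t file;
        lfs_file_open(lfs, &file, "data.bin", LFS_O_RDWR | LFS_O_CREAT);

        lfs_file_seek(lfs, &file, 0, LFS_SEEK_SET);       /* fresh file, nothing to flush yet */
        lfs_file_write(lfs, &file, block, sizeof(block));

        lfs_file_seek(lfs, &file, 0, LFS_SEEK_SET);       /* * window unchanged: avoidable flush */
        lfs_file_write(lfs, &file, block, sizeof(block));

        lfs_file_seek(lfs, &file, 2048, LFS_SEEK_SET);    /* window moves: flush is needed */
        lfs_file_write(lfs, &file, block, sizeof(block));

        lfs_file_seek(lfs, &file, 2048, LFS_SEEK_SET);    /* * window unchanged: avoidable flush */
        lfs_file_write(lfs, &file, block, sizeof(block));

        lfs_file_close(lfs, &file);                       /* final flush on close */
    }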

geky commented 1 week ago

Hi @stefano-zanotti, thanks for creating an issue.

There is unfortunately a tradeoff between performance and code-size when it comes to leveraging caches. It's certainly simpler to just throw away the cache on seek.

This is probably a case where making the caching logic a bit smarter is warranted. But at the moment there is a significant amount of work related to files bubbling down the pipeline, so it may be better to wait and revisit this after larger data-structure-related performance issues are potentially shaken out.

stefano-zanotti commented 1 week ago

I understand this tradeoff.

Could you please just verify one thing? I tried replacing these lines: https://github.com/littlefs-project/littlefs/blob/d01280e64934a09ba16cac60cf9d3a37e228bb66/lfs.c#L3698-L3702 with a simple "true", to always perform the "skip re-caching if unneeded" optimization, but I had problems down the line. However, it seems to me that this should be enough to fix my issue. If I'm right, the fix would be extremely simple even now. Or are there other reasons I'm not seeing why this can't work?

geky commented 1 week ago

So, the main difference between LFS_F_READING and LFS_F_WRITING is that the data exists on disk when LFS_F_READING, but not when LFS_F_WRITING: in that case the data only exists in the cache.

So it's safe to discard the data with LFS_F_READING, but if we discard the data with LFS_F_WRITING, we've lost the data.

Consider this sequence:

SEEK 0, WRITE 2048  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa
SEEK 0, WRITE 1024  bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaa

If our cache size is 2048, there are 1024 bytes in the range 1024-2048 that we lose track of. Unfortunately littlefs doesn't know how much data is going to be written at seek time, so it needs to be conservative and write everything out.
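
Spelled out with the public API (just an illustration; a mounted lfs_t is assumed and error handling is omitted), the end state has to come out right no matter what the cache does in between:

    #include <string.h>
    #include <stdint.h>
    #include "lfs.h"

    /* The sequence above: after close, the file must contain 1024 'b' bytes
       followed by 1024 'a' bytes. The trailing 'a' bytes only ever existed in
       the cache, so a seek that simply discarded the dirty cache would lose
       them. */
    void overwrite_prefix_example(lfs_t *lfs) {
        uint8_t a[2048], b[1024], out[2048];
        memset(a, 'a', sizeof(a));
        memset(b, 'b', sizeof(b));

        lfs_file_t file;
        lfs_file_open(lfs, &file, "f", LFS_O_RDWR | LFS_O_CREAT | LFS_O_TRUNC);
        lfs_file_seek(lfs, &file, 0, LFS_SEEK_SET);
        lfs_file_write(lfs, &file, a, sizeof(a));     /* aaaa...aaaa (2048 bytes) */
        lfs_file_seek(lfs, &file, 0, LFS_SEEK_SET);   /* dirty cache must not be dropped here */
        lfs_file_write(lfs, &file, b, sizeof(b));     /* bbbb...aaaa */
        lfs_file_close(lfs, &file);

        lfs_file_open(lfs, &file, "f", LFS_O_RDONLY);
        lfs_file_read(lfs, &file, out, sizeof(out));  /* expect 1024 'b', then 1024 'a' */
        lfs_file_close(lfs, &file);
    }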


I may be wrong, but I think what you're looking for is a sort of write-cancellation API. There is something in the works on this (lfs_file_desync), but it's only a WIP at the moment...

stefano-zanotti commented 1 week ago

I'm not sure lfs_file_desync would help me. Logically, I don't need to abort writes. I might exploit it to circumvent the problem, that's true, but I'm not really in control of the software that performs the write calls to LFS, so I can't tell when it would be safe to desync the file. Right now it seems that I could desync before any "seek backwards", but that software might change outside my control, or have edge cases, so the only practicable solution is to have LFS keep proper track of the cached data.

I still don't fully understand the issue. I'm not saying that on LFS_F_WRITING we should discard data; I'm saying that on LFS_F_WRITING we should just retain the current cache, with no reload and no flush, if the cache window doesn't change.

If we do:

SEEK 0      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 0, writing=0
            ^
WRITE 2048  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
                                            ^
SEEK 0      aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
            ^                               
WRITE 1024  bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
                            ^
CLOSE

LittleFS still knows that the file is 2048 bytes long (since overwriting with the final WRITE does not imply truncating), and since CLOSE flushes out the whole cache anyway, it should write out the whole "bb...ba...aa" data, which would be correct. The fact that the position points at 1024 should be irrelevant. (I've marked the dirty portion of the cache, though I'm not sure whether LFS keeps track of it explicitly.)
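
As a minimal sketch of what I'm proposing (the names are mine, purely for illustration, not littlefs internals): when the seek target is still inside the cached window, only the position changes, and the dirty data stays in the cache until the next write/sync/close flushes it.

    #include <stdbool.h>
    #include <stdint.h>

    /* Illustrative per-file state; not the littlefs structures. */
    struct cached_file {
        uint32_t pos;       /* current file position */
        uint32_t size;      /* logical file size */
        uint32_t win_off;   /* file offset covered by the cached window */
        uint32_t win_len;   /* length of the cached window */
        bool     dirty;     /* window holds data not yet on disk */
        uint8_t  *win;      /* buffer backing the window */
    };

    /* Proposed seek: if the target stays inside the cached window, just move
       the position. The dirty flag (and the data) are left untouched, so a
       later write, sync, or close still flushes the full "bb...ba...aa"
       window. */
    static bool seek_within_window(struct cached_file *f, uint32_t new_pos) {
        if (new_pos >= f->win_off && new_pos < f->win_off + f->win_len) {
            f->pos = new_pos;       /* no flush, no reload */
            return true;
        }
        return false;               /* caller falls back to flush + re-cache */
    }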

The same would work in this case:

SEEK 0      xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 0, writing=0
            ^
WRITE 2048  aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
                                            ^
SEEK 0      aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
            ^                               
WRITE 1024  bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
                            ^
SEEK 2048   bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaa   filesize 2048, writing=1, dirty [0, 2048]
                                            ^
WRITE 1024  bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaacccccccccccccccc   filesize 3072, writing=1, dirty [2048, 3072]
                                                            ^

The last write must detect that the cache window must move, so it should flush the current cache before fetching (and overwriting) the next window:

WRITE 1024
    (state when write is called)
    <-------------CACHE------------>
    bbbbbbbbbbbbbbbbaaaaaaaaaaaaaaaaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=1, dirty [0, 2048]
                                    ^
    (flush cache to disk)
    <-------------CACHE------------>
    BBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=0
                                    ^
    (load new cache window -- might not even load from disk, since we're past the end of the file, and just fill with 0s or 1s, or just leave "uninitialized")
                                    <-------------CACHE------------>
    BBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=0
                                    ^
    (perform the write -- in cache)
                                    <-------------CACHE------------>
    BBBBBBBBBBBBBBBBAAAAAAAAAAAAAAAAccccccccccccccccxxxxxxxxxxxxxxxx   filesize 3072, writing=1, dirty [2048, 3072]
                                                    ^
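
Continuing the same illustrative sketch (reusing struct cached_file from above; flush_window and load_window are assumed helpers, not littlefs functions), the decision the write has to make is roughly:

    #include <string.h>

    /* Hypothetical helpers, assumed to exist only for this sketch. */
    static void flush_window(struct cached_file *f);               /* write the old window to disk */
    static void load_window(struct cached_file *f, uint32_t off);  /* re-anchor the window at off */

    /* Write that fits inside one window, as in the diagrams: flush and move
       the window only when the target falls outside it; otherwise the data
       just lands in the cache and is flushed later by sync/close. */
    static void write_one_window(struct cached_file *f,
            const uint8_t *data, uint32_t len) {
        if (f->pos < f->win_off || f->pos + len > f->win_off + f->win_len) {
            if (f->dirty) {
                flush_window(f);        /* old window goes to disk first */
            }
            load_window(f, f->pos);     /* move the window over the new position */
            f->dirty = false;
        }
        memcpy(&f->win[f->pos - f->win_off], data, len);
        f->dirty = true;
        f->pos += len;
        if (f->pos > f->size) {
            f->size = f->pos;           /* appending extends the logical file size */
        }
    }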

This behavior is needed even without using seek(), that is, when we just keep appending data to the file, so I assume that LFS already knows how to do it (I mean, it knows when to flush the cache to disk, if the new write needs to enter a new window).

    <-------------CACHE------------>
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 0, writing=0
    ^
WRITE 2048
    <-------------CACHE------------>
    aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=1, dirty [0, 2048]
                                    ^
WRITE 2048
    <-------------CACHE------------>
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=0
                                    ^
                                    <-------------CACHE------------>
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx   filesize 2048, writing=0
                                    ^
                                    <-------------CACHE------------>
    AAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbbb   filesize 4096, writing=1, dirty [2048, 4096]
                                                                    ^

The second write must have flushed the data properly, as described above.

This exact behavior is what I need in my case; the only difference is that the preceding seek should not flush the cache (if the window doesn't move) and should leave the flushing to the next write/close/sync operation.