Open BenBE opened 9 months ago
Hi @BenBE, thanks for creating an issue.
This is quite an interesting use of ASAN.
I'd be concerned that littlefs's view of the world (uninitialized memory is random but static) vs C/ASAN's view (uninitialized memory is bad bad bad bad bad) would lead to false positives, but this looks like it found a real issue.
Also thanks for posting detailed steps to reproduce.
- There is a chance, that certain code paths read memory despite the block layer returning that memory to be corrupted
I think this is a non-issue. We do the le32 conversion before checking for an error, but we don't actually use the value after that.
The reason for the weird before-error conversion is because we do the same pattern for writes. For writes, the words are sometimes in-use by internal structs. We need to do le32 conversion to/from for these words before error-checking or else in-RAM state can become corrupted.
It's all a bit of a hack but fortunately we don't need to do this for reads, so while I think this le32 conversion has no impact, we can move the le32 conversion to after the error check to improve static analysis.
Though we may need to do this in more places than this.
- When only the second block of a
dir_pair
is readable AND the revision in thatdir_pair
is negative, the sector is ignored, despite being the only usable one.
This looks like a legitimate bug. Revision counts should be compared with sequence arithmetic so "negative" values are still valid.
I think this hasn't been hit until now because most triggers of LFS_ERR_CORRUPT don't write to the buffer, leaving the revision count as 0. By the time this overflows no corrupted blocks in the metadata pair remain, though this can still be hit by a block device driver that writes arbitrary values to the buffer on LFS_ERR_CORRUPT.
I'll try to get a fix into the next patch release.
- There is a chance, that certain code paths read memory despite the block layer returning that memory to be corrupted
I think this is a non-issue. We do the le32 conversion before checking for an error, but we don't actually use the value after that.
Based on my experience, it's often better to delay processing information to the point when you know the information is valid. Else you end up with a kind of quantum state, where your "converted" data is both valid and invalid at the same time, until you performed your check. Reducing that time as much as sensible often allows for other debugging measures (like ASAN) to be much better with finding issues*.
I can understand that you should do some repeating patterns for similar checks as this eases locating issues. Preferably the patterns you use should be both consistent and correct. With the PoC for this issue, we toyed with this assumption to force things to go south quickly.
*There's actually a limitation in the way ASAN marks memory as readable/unreadable, which effectively caused the error to be reported later than it should have (the first call to lfs_rawmount
is already broken, but the report is somewhere in lfs_dir_fetchmatch
through lfs_rawopen
). The reason being that ASAN can't fully check each byte for access, but suffixes inside 8-byte groups.
2. When only the second block of a
dir_pair
is readable AND the revision in thatdir_pair
is negative, the sector is ignored, despite being the only usable one.This looks like a legitimate bug. Revision counts should be compared with sequence arithmetic so "negative" values are still valid.
I think this hasn't been hit until now because most triggers of LFS_ERR_CORRUPT don't write to the buffer, leaving the revision count as 0.
The issue is subtly different even. The PoC does not write to the buffer on LFS_ERR_CORRUPT
(only marks the buffer as poisoned for ASAN), but forces the valid part of the pair to have a negative sequence number. The bug that is triggered in this case is that the sequence number starts out as 0
and is kept there after the invalid first block is read (correct so far). When the second (valid) block is read, a negative sequence number is found and compared to that initial value of 0
(<-- bug here). This comparison fails, thus that block is not considered any further, despite being (the first) valid one. The initial index for the block to use is also set to 0
(the first block) and should have been updated to 1
, as the only valid block is the second one. This causes the subsequent code to access the (uninitialized) data of the first block to be processed instead of the information of the (properly read) second block.
Additionally there's a check missing if none of the blocks could be properly read (i.e. all returned LFS_ERR_CORRUPT
).
I hope this clarifies the situation with this bug a bit more. Especially as the "invalid read" is not with the le32 conversion (although that's usually a code smell at least), but below that loop when further processing is done.
Edit: The issue was spotted during (offline) code review while examining if you can cause undefined behaviour in the loop that reads the pairs; ASAN was only used to demonstrate its presence in the PoC, as that gives a clear "crash". The issue itself normally would only cause silent corruption, as all the memory accessed is from littlefs; thus the abuse of ASAN's memory poisoning to mark the buffers as invalid.
I was trying to reproduce locally and was having difficulty. I think this still can't result in a failed fetch.
Even if we pick up the wrong revision count, the later checks for checksum/tag validity will still toss out the corrupted block and fallback to fetching the correct block (here).
Which makes sense, the revision count can be a blatantly wrong value if powerloss caused a commit to only be partially written.
While reviewing littlefs we (@Maaxxs, @stoeckmann and me) noticed an issue, where littlefs reads from potentially uninitialized memory.
A simple way to reproduce this is to apply the following patch:
and compile a small demo program using that rambd driver with ASAN enabled:
(
thing.c
is the example from the README plus the necessary structs to uselfs_rambd.c
)When running
./thing
you get output like this:There are two issues in littlefs demonstrated by this PoC:
dir_pair
is readable AND the revision in thatdir_pair
is negative, the sector is ignored, despite being the only usable one.The first issue is a problem with how memory is handled when a sector could not be properly read.
The second issue is a bug in how the most recent revision is determined in
lfs_dir_fetchmatch
:If you have any questions or need more information, feel free to ask away. :)
thing.c
```c #include "lfs.h" #include "bd/lfs_rambd.h" // variables used by the filesystem lfs_t lfs; lfs_file_t file; struct lfs_rambd_config lfs_rambd_cfg = { // Minimum size of a read operation in bytes. .read_size = 4, // Minimum size of a program operation in bytes. .prog_size = 16, // Size of an erase operation in bytes. .erase_size = 4096, // Number of erase blocks on the device. .erase_count = 1024, // Optional statically allocated buffer for the block device. .buffer = NULL, }; lfs_rambd_t lfs_rambd; // configuration of the filesystem is provided by this struct struct lfs_config cfg = { .context = &lfs_rambd, // block device operations .read = lfs_rambd_read, .prog = lfs_rambd_prog, .erase = lfs_rambd_erase, .sync = lfs_rambd_sync, // block device configuration .read_size = 4, .prog_size = 16, .block_size = 4096, .block_count = 128, .cache_size = 16, .lookahead_size = 16, .block_cycles = 500, }; // entry point int main(void) { if (lfs_rambd_create(&cfg, &lfs_rambd_cfg) != 0) { puts("could not create rambd"); exit(1); } // mount the filesystem int err = lfs_mount(&lfs, &cfg); // reformat if we can't mount the filesystem // this should only happen on the first boot if (err) { lfs_format(&lfs, &cfg); lfs_mount(&lfs, &cfg); } // read current count uint32_t boot_count = 0; lfs_file_open(&lfs, &file, "boot_count", LFS_O_RDWR | LFS_O_CREAT); lfs_file_read(&lfs, &file, &boot_count, sizeof(boot_count)); // update boot count boot_count += 1; lfs_file_rewind(&lfs, &file); lfs_file_write(&lfs, &file, &boot_count, sizeof(boot_count)); // remember the storage is not updated until the file is closed successfully lfs_file_close(&lfs, &file); // release any resources we were using lfs_unmount(&lfs); // print the boot count printf("boot_count: %d\n", boot_count); lfs_rambd_destroy(&cfg); } ```