Access to buffers returned as corrupted by the block layer

BenBE commented 9 months ago

While reviewing littlefs, we (@Maaxxs, @stoeckmann, and I) noticed an issue where littlefs reads from potentially uninitialized memory.

A simple way to reproduce this is to apply the following patch:

diff --git a/bd/lfs_rambd.c b/bd/lfs_rambd.c
index a6a05727..d6282191 100644
--- a/bd/lfs_rambd.c
+++ b/bd/lfs_rambd.c
@@ -7,6 +7,8 @@
  */
 #include "bd/lfs_rambd.h"

+#include "sanitizer/asan_interface.h"
+
 int lfs_rambd_create(const struct lfs_config *cfg,
         const struct lfs_rambd_config *bdcfg) {
     LFS_RAMBD_TRACE("lfs_rambd_create(%p {.context=%p, "
@@ -69,6 +71,21 @@ int lfs_rambd_read(const struct lfs_config *cfg, lfs_block_t block,
     memcpy(buffer, &bd->buffer[block*bd->cfg->erase_size + off], size);

     LFS_RAMBD_TRACE("lfs_rambd_read -> %d", 0);
+
+    if (block == 0 && off == 0 && size == 4) {
+        LFS_RAMBD_TRACE("Corrupted foo! -> %d", LFS_ERR_CORRUPT);
+        ASAN_POISON_MEMORY_REGION(buffer, size);
+        return LFS_ERR_CORRUPT;
+    }
+    if (block == 1 && off == 0 && size == 4) {
+        LFS_RAMBD_TRACE("Negative Serial -> %d", 0);
+        ASAN_UNPOISON_MEMORY_REGION(buffer, size);
+        ((uint8_t*)buffer)[3] |= 0x80;
+        return 0;
+    }
+
+    ASAN_UNPOISON_MEMORY_REGION(buffer, size);
+
     return 0;
 }

and compile a small demo program using that rambd driver with ASAN enabled:

BUILDDIR=$(pwd); gcc -fsanitize=address -g -DLFS_RAMBD_YES_TRACE -DLFS_YES_TRACE -I${BUILDDIR} lfs.c lfs_util.c bd/lfs_rambd.c -o thing thing.c

(thing.c is the example from the README plus the necessary structs to use lfs_rambd.c)

When running ./thing you get output like this:

bd/lfs_rambd.c:14:trace: lfs_rambd_create(0x55af2c77b160 {.context=0x55af2c77b5e0, .read=0x55af2c76dcbb, .prog=0x55af2c76e1ff, .erase=0x55af2c76e5fd, .sync=0x55af2c76e7d1}, 0x55af2c77b120 {.read_size=4, .prog_size=16, .erase_size=4096, .erase_count=1024, .buffer=(nil)})
bd/lfs_rambd.c:42:trace: lfs_rambd_create -> 0
lfs.c:5788:trace: lfs_mount(0x55af2c77b480, 0x55af2c77b160 {.context=0x55af2c77b5e0, .read=0x55af2c76dcbb, .prog=0x55af2c76e1ff, .erase=0x55af2c76e5fd, .sync=0x55af2c76e7d1, .read_size=4, .prog_size=16, .block_size=4096, .block_count=128, .block_cycles=500, .cache_size=16, .lookahead_size=16, .read_buffer=(nil), .prog_buffer=(nil), .lookahead_buffer=(nil), .name_max=0, .file_max=0, .attr_max=0})
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x0, 0, 0x7f14132001a0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:76:trace: Corrupted foo! -> -84
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 0, 0x7f14132001a4, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:81:trace: Negative Serial -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x0, 4, 0x602000000010, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 4, 0x602000000010, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
lfs.c:1346:error: Corrupted dir pair at {0x0, 0x1}
lfs.c:5807:trace: lfs_mount -> -84
lfs.c:5758:trace: lfs_format(0x55af2c77b480, 0x55af2c77b160 {.context=0x55af2c77b5e0, .read=0x55af2c76dcbb, .prog=0x55af2c76e1ff, .erase=0x55af2c76e5fd, .sync=0x55af2c76e7d1, .read_size=4, .prog_size=16, .block_size=4096, .block_count=128, .block_cycles=500, .cache_size=16, .lookahead_size=16, .read_buffer=(nil), .prog_buffer=(nil), .lookahead_buffer=(nil), .name_max=0, .file_max=0, .attr_max=0})
…
lfs.c:5777:trace: lfs_format -> 0
lfs.c:5788:trace: lfs_mount(0x55af2c77b480, 0x55af2c77b160 {.context=0x55af2c77b5e0, .read=0x55af2c76dcbb, .prog=0x55af2c76e1ff, .erase=0x55af2c76e5fd, .sync=0x55af2c76e7d1, .read_size=4, .prog_size=16, .block_size=4096, .block_count=128, .block_cycles=500, .cache_size=16, .lookahead_size=16, .read_buffer=(nil), .prog_buffer=(nil), .lookahead_buffer=(nil), .name_max=0, .file_max=0, .attr_max=0})
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x0, 0, 0x7f14132008a0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:76:trace: Corrupted foo! -> -84
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 0, 0x7f14132008a4, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:81:trace: Negative Serial -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 4, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 20, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 36, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 52, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 48, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 64, 0x6020000000d0, 16)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 56, 0x7f14130005e0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 44, 0x7f14130005e0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 16, 0x7f14130005e0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 20, 0x7f1413200760, 24)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 56, 0x7f1413000660, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 44, 0x7f1413000660, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 16, 0x7f1413000660, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 4, 0x7f1413000660, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
lfs.c:5807:trace: lfs_mount -> 0
lfs.c:5928:trace: lfs_file_open(0x55af2c77b480, 0x55af2c77b540, "boot_count", 103)
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x1, 0, 0x7f1413200ba0, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:81:trace: Negative Serial -> 0
bd/lfs_rambd.c:59:trace: lfs_rambd_read(0x55af2c77b160, 0x0, 0, 0x7f1413200ba4, 4)
bd/lfs_rambd.c:73:trace: lfs_rambd_read -> 0
bd/lfs_rambd.c:76:trace: Corrupted foo! -> -84
=================================================================
==3208565==ERROR: AddressSanitizer: stack-buffer-overflow on address 0x7f1413200ba4 at pc 0x55af2c74d35f bp 0x7fff642d1d30 sp 0x7fff642d1d20
READ of size 4 at 0x7f1413200ba4 thread T0
    #0 0x55af2c74d35e in lfs_dir_fetchmatch /home/user/littlefs/lfs.c:1094
    #1 0x55af2c750434 in lfs_dir_find /home/user/littlefs/lfs.c:1516
    #2 0x55af2c75b5c6 in lfs_file_rawopencfg /home/user/littlefs/lfs.c:3031
    #3 0x55af2c75c622 in lfs_file_rawopen /home/user/littlefs/lfs.c:3177
    #4 0x55af2c76be7d in lfs_file_open /home/user/littlefs/lfs.c:5932
    #5 0x55af2c76ea08 in main /home/user/littlefs/thing.c:74
    #6 0x7f14150280cf in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
    #7 0x7f1415028188 in __libc_start_main_impl ../csu/libc-start.c:360
    #8 0x55af2c7454c4 in _start (/home/user/littlefs/thing+0x54c4) (BuildId: c8fe4cbf8fab401c0cd61a4e7e3aa611af6a0221)

Address 0x7f1413200ba4 is located in stack of thread T0 at offset 164 in frame
    #0 0x55af2c74cfd9 in lfs_dir_fetchmatch /home/user/littlefs/lfs.c:1075

  This frame has 8 object(s):
    [32, 36) 'crc' (line 1126)
    [48, 52) 'tag' (line 1131)
    [64, 68) 'dcrc' (line 1161)
    [80, 84) 'fcrc_' (line 1309)
    [96, 104) 'fcrc' (line 1123)
    [128, 136) '<unknown>'
    [160, 168) 'revs' (line 1088) <== Memory access at offset 164 is inside this variable
    [192, 200) 'temptail' (line 1116)
HINT: this may be a false positive if your program uses some custom stack unwind mechanism, swapcontext or vfork
      (longjmp and C++ exceptions *are* supported)
SUMMARY: AddressSanitizer: stack-buffer-overflow /home/user/littlefs/lfs.c:1094 in lfs_dir_fetchmatch
Shadow bytes around the buggy address:
  0x7f1413200900: f1 f1 f1 f1 f1 f1 00 f2 f2 f2 00 00 00 00 00 00
  0x7f1413200980: f3 f3 f3 f3 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f1413200a00: f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5 f5
  0x7f1413200a80: f5 f5 f5 f5 f5 f5 f5 f5 00 00 00 00 00 00 00 00
  0x7f1413200b00: f1 f1 f1 f1 04 f2 04 f2 04 f2 04 f2 00 f2 f2 f2
=>0x7f1413200b80: 00 f2 f2 f2[04]f2 f2 f2 00 f3 f3 f3 00 00 00 00
  0x7f1413200c00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f1413200c80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f1413200d00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f1413200d80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x7f1413200e00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Shadow byte legend (one shadow byte represents 8 application bytes):
  Addressable:           00
  Partially addressable: 01 02 03 04 05 06 07 
  Heap left redzone:       fa
  Freed heap region:       fd
  Stack left redzone:      f1
  Stack mid redzone:       f2
  Stack right redzone:     f3
  Stack after return:      f5
  Stack use after scope:   f8
  Global redzone:          f9
  Global init order:       f6
  Poisoned by user:        f7
  Container overflow:      fc
  Array cookie:            ac
  Intra object redzone:    bb
  ASan internal:           fe
  Left alloca redzone:     ca
  Right alloca redzone:    cb
==3208565==ABORTING

There are two issues in littlefs demonstrated by this PoC:

1. There is a chance, that certain code paths read memory despite the block layer returning that memory to be corrupted
2. When only the second block of a dir_pair is readable AND the revision in that dir_pair is negative, the sector is ignored, despite being the only usable one.

The first issue is a problem with how memory is handled when a sector could not be properly read.

The second issue is a bug in how the most recent revision is determined in lfs_dir_fetchmatch:

    // find the block with the most recent revision
    uint32_t revs[2] = {0, 0};
    int r = 0;
    for (int i = 0; i < 2; i++) {
        int err = lfs_bd_read(lfs,
                NULL, &lfs->rcache, sizeof(revs[i]),
                pair[i], 0, &revs[i], sizeof(revs[i]));
        revs[i] = lfs_fromle32(revs[i]); // Invalid read, should go AFTER the error check
        if (err && err != LFS_ERR_CORRUPT) {
            return err;
        }

        // \/--- Should ALWAYS be true for the first (valid) read we encounter
        if (err != LFS_ERR_CORRUPT &&
                lfs_scmp(revs[i], revs[(i+1)%2]) > 0) {
            r = i;
        }
    }

If you have any questions or need more information, feel free to ask away. :)

thing.c

```c
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#include "lfs.h"
#include "bd/lfs_rambd.h"

// variables used by the filesystem
lfs_t lfs;
lfs_file_t file;

struct lfs_rambd_config lfs_rambd_cfg = {
    // Minimum size of a read operation in bytes.
    .read_size = 4,

    // Minimum size of a program operation in bytes.
    .prog_size = 16,

    // Size of an erase operation in bytes.
    .erase_size = 4096,

    // Number of erase blocks on the device.
    .erase_count = 1024,

    // Optional statically allocated buffer for the block device.
    .buffer = NULL,
};

lfs_rambd_t lfs_rambd;

// configuration of the filesystem is provided by this struct
struct lfs_config cfg = {
    .context = &lfs_rambd,

    // block device operations
    .read  = lfs_rambd_read,
    .prog  = lfs_rambd_prog,
    .erase = lfs_rambd_erase,
    .sync  = lfs_rambd_sync,

    // block device configuration
    .read_size = 4,
    .prog_size = 16,
    .block_size = 4096,
    .block_count = 128,
    .cache_size = 16,
    .lookahead_size = 16,
    .block_cycles = 500,
};

// entry point
int main(void) {
    if (lfs_rambd_create(&cfg, &lfs_rambd_cfg) != 0) {
        puts("could not create rambd");
        exit(1);
    }

    // mount the filesystem
    int err = lfs_mount(&lfs, &cfg);

    // reformat if we can't mount the filesystem
    // this should only happen on the first boot
    if (err) {
        lfs_format(&lfs, &cfg);
        lfs_mount(&lfs, &cfg);
    }

    // read current count
    uint32_t boot_count = 0;
    lfs_file_open(&lfs, &file, "boot_count", LFS_O_RDWR | LFS_O_CREAT);
    lfs_file_read(&lfs, &file, &boot_count, sizeof(boot_count));

    // update boot count
    boot_count += 1;
    lfs_file_rewind(&lfs, &file);
    lfs_file_write(&lfs, &file, &boot_count, sizeof(boot_count));

    // remember the storage is not updated until the file is closed successfully
    lfs_file_close(&lfs, &file);

    // release any resources we were using
    lfs_unmount(&lfs);

    // print the boot count
    printf("boot_count: %d\n", boot_count);

    lfs_rambd_destroy(&cfg);
}
```

geky commented 9 months ago

Hi @BenBE, thanks for creating an issue.

This is quite an interesting use of ASAN.

I'd be concerned that littlefs's view of the world (uninitialized memory is random but static) vs C/ASAN's view (uninitialized memory is bad bad bad bad bad) would lead to false positives, but this looks like it found a real issue.

Also thanks for posting detailed steps to reproduce.

> There is a chance, that certain code paths read memory despite the block layer returning that memory to be corrupted

I think this is a non-issue. We do the le32 conversion before checking for an error, but we don't actually use the value after that.

The reason for the weird before-error conversion is because we do the same pattern for writes. For writes, the words are sometimes in-use by internal structs. We need to do le32 conversion to/from for these words before error-checking or else in-RAM state can become corrupted.

It's all a bit of a hack but fortunately we don't need to do this for reads, so while I think this le32 conversion has no impact, we can move the le32 conversion to after the error check to improve static analysis.

Though we may need to do this in more places than this.

> When only the second block of a dir_pair is readable AND the revision in that dir_pair is negative, the sector is ignored, despite being the only usable one.

This looks like a legitimate bug. Revision counts should be compared with sequence arithmetic so "negative" values are still valid.
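For readers unfamiliar with sequence arithmetic, here is a minimal sketch of the idea. `seq_cmp` is modeled on how `lfs_scmp` is commonly written (the sign of the wrapped 32-bit difference); `rev_is_newer` is a hypothetical helper added only for illustration, and the conversion to `int` assumes the usual two's-complement behaviour.

```c
#include <stdint.h>

// Sequence (serial-number) comparison: the sign of the wrapped difference
// tells which revision is newer, so counters may overflow safely.
// Assumes the common two's-complement conversion from unsigned to int.
static int seq_cmp(uint32_t a, uint32_t b) {
    return (int)(unsigned)(a - b);
}

// Hypothetical helper: nonzero if revision a is newer than revision b.
static int rev_is_newer(uint32_t a, uint32_t b) {
    return seq_cmp(a, b) > 0;
}
```

With this comparison, revision 2 counts as newer than 0xFFFFFFFE (the counter wrapped), while 0x80000001 counts as older than 0, which is why a "negative" revision loses against a zero-initialized placeholder.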

I think this hasn't been hit until now because most triggers of LFS_ERR_CORRUPT don't write to the buffer, leaving the revision count as 0. By the time this overflows no corrupted blocks in the metadata pair remain, though this can still be hit by a block device driver that writes arbitrary values to the buffer on LFS_ERR_CORRUPT.

I'll try to get a fix into the next patch release.

BenBE commented 9 months ago

> > There is a chance, that certain code paths read memory despite the block layer returning that memory to be corrupted
>
> I think this is a non-issue. We do the le32 conversion before checking for an error, but we don't actually use the value after that.

Based on my experience, it's often better to delay processing information until the point where you know the information is valid. Otherwise you end up with a kind of quantum state, where your "converted" data is both valid and invalid at the same time until you have performed your check. Keeping that window as short as is sensible often makes other debugging measures (like ASAN) much better at finding issues*.

I can understand that you should do some repeating patterns for similar checks as this eases locating issues. Preferably the patterns you use should be both consistent and correct. With the PoC for this issue, we toyed with this assumption to force things to go south quickly.

*There's actually a limitation in the way ASAN marks memory as readable/unreadable, which effectively caused the error to be reported later than it should have been (the first call to lfs_rawmount is already broken, but the report only appears in lfs_dir_fetchmatch, reached through lfs_file_rawopen). The reason is that ASAN can't track each byte individually: within each 8-byte group it can only mark a suffix as inaccessible.
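To make the granularity point concrete, here is a simplified model of a single 8-byte shadow granule. This is an assumption-laden illustration, not ASAN's actual code: `granule_poison` is a hypothetical function, and the state is just the number of leading addressable bytes (matching the "Partially addressable: 01 to 07" entries in the shadow byte legend).

```c
#include <stddef.h>

// Simplified model of one 8-byte ASAN shadow granule (illustration only).
// 'addressable' is the number of leading bytes that may be accessed:
// 8 = fully addressable, 0 = fully poisoned. Because the state is a single
// prefix length, only a suffix of the granule can be poisoned.
// The caller is expected to keep off + len <= 8.
static unsigned granule_poison(unsigned addressable, size_t off, size_t len) {
    if (len == 0 || off >= addressable) {
        return addressable;  // nothing addressable is affected
    }
    if (off + len < addressable) {
        // Addressable bytes would remain after the poisoned range; a prefix
        // length cannot express that, so the request has no effect.
        return addressable;
    }
    return (unsigned)off;    // bytes [off, 8) are now poisoned
}
```

In this model, poisoning the first 4 bytes of a fully addressable granule changes nothing, while poisoning the last 4 bytes yields a state of 4. That appears consistent with the log: the corrupted read into the buffer ending in ...1a0 during mount went unnoticed, whereas the later read into ...ba4 produced the `[04]` shadow byte in the report.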

> > 2. When only the second block of a dir_pair is readable AND the revision in that dir_pair is negative, the sector is ignored, despite being the only usable one.
>
> This looks like a legitimate bug. Revision counts should be compared with sequence arithmetic so "negative" values are still valid.
>
> I think this hasn't been hit until now because most triggers of LFS_ERR_CORRUPT don't write to the buffer, leaving the revision count as 0.

The issue is actually subtly different. The PoC does not write to the buffer on LFS_ERR_CORRUPT (it only marks the buffer as poisoned for ASAN), but forces the valid part of the pair to have a negative sequence number. The bug triggered in this case is that the sequence number starts out as 0 and is kept there after the invalid first block is read (correct so far). When the second (valid) block is read, a negative sequence number is found and compared to that initial value of 0 (<-- bug here). This comparison fails, so that block is not considered any further, despite being the first valid one. The initial index for the block to use is also set to 0 (the first block) and should have been updated to 1, as the only valid block is the second one. This causes the subsequent code to process the (uninitialized) data of the first block instead of the information from the (properly read) second block.

Additionally, there's a check missing for the case where none of the blocks could be properly read (i.e. all returned LFS_ERR_CORRUPT).
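The selection problem can be modeled outside littlefs. The sketch below is a standalone illustration under stated assumptions, not littlefs code and not a proposed patch: `struct rev_read`, `pick_as_quoted`, and `pick_tracking_validity` are hypothetical names, and a corrupted read is assumed to leave the revision at its initial 0. The second function shows one possible way to let the first readable block win and to report the case where neither block is readable.

```c
#include <stdint.h>

enum { ERR_OK = 0, ERR_CORRUPT = -84 };  // -84 is LFS_ERR_CORRUPT in the log

// Outcome of reading one block's revision word (hypothetical type).
struct rev_read {
    int err;       // ERR_OK, ERR_CORRUPT, or another negative error
    uint32_t rev;  // only meaningful when err == ERR_OK
};

static int seq_cmp(uint32_t a, uint32_t b) {
    return (int)(unsigned)(a - b);
}

// Models the loop quoted in the issue: revs and r both start at 0, and a
// readable block is only selected if it compares newer than the other slot.
static int pick_as_quoted(const struct rev_read in[2]) {
    uint32_t revs[2] = {0, 0};
    int r = 0;
    for (int i = 0; i < 2; i++) {
        if (in[i].err && in[i].err != ERR_CORRUPT) {
            return in[i].err;
        }
        if (in[i].err == ERR_OK) {
            revs[i] = in[i].rev;
        }
        if (in[i].err != ERR_CORRUPT &&
                seq_cmp(revs[i], revs[(i+1)%2]) > 0) {
            r = i;
        }
    }
    return r;
}

// One possible alternative: the first readable block always wins, a later
// readable block only if it is newer, and ERR_CORRUPT is returned when
// neither block could be read.
static int pick_tracking_validity(const struct rev_read in[2]) {
    uint32_t best = 0;
    int r = -1;
    for (int i = 0; i < 2; i++) {
        if (in[i].err == ERR_CORRUPT) {
            continue;
        }
        if (in[i].err) {
            return in[i].err;
        }
        if (r < 0 || seq_cmp(in[i].rev, best) > 0) {
            r = i;
            best = in[i].rev;
        }
    }
    return (r < 0) ? ERR_CORRUPT : r;
}
```

For the PoC's situation (block 0 corrupted, block 1 readable with revision 0x80000001), `pick_as_quoted` returns index 0 and `pick_tracking_validity` returns index 1.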

I hope this clarifies the situation with this bug a bit more. Especially as the "invalid read" is not with the le32 conversion (although that's usually a code smell at least), but below that loop when further processing is done.

Edit: The issue was spotted during (offline) code review while examining if you can cause undefined behaviour in the loop that reads the pairs; ASAN was only used to demonstrate its presence in the PoC, as that gives a clear "crash". The issue itself normally would only cause silent corruption, as all the memory accessed is from littlefs; thus the abuse of ASAN's memory poisoning to mark the buffers as invalid.

geky commented 9 months ago

I was trying to reproduce locally and was having difficulty. I think this still can't result in a failed fetch.

Even if we pick up the wrong revision count, the later checks for checksum/tag validity will still toss out the corrupted block and fall back to fetching the correct block.

This makes sense: the revision count can be a blatantly wrong value if power loss caused a commit to be only partially written.

littlefs-project / littlefs

Access to buffers returned as corrupted by the block layer #904