Suspicious comparison at `lfs_dir_find_match()`

andriyndev commented 5 months ago

The function uses lfs_bd_cmp() which, from what I understand, compares data on disk with an arbitrary data passed as argument, and returns:

LFS_CMP_LT - the data on disk is less than the arbitrary data
LFS_CMP_GT - the data on disk is greater than the arbitrary data
LFS_CMP_EQ - the data on disk and the arbitrary data are equal

If the function returns LFS_CMP_EQ, we perform an additional comparison:

    if (name->size != lfs_tag_size(tag)) {
        return (name->size < lfs_tag_size(tag)) ? LFS_CMP_LT : LFS_CMP_GT;
    }

This comparison looks suspicious because it returns LFS_CMP_LT is the arbitrary data (not the data on disk) is shorter than the data on disk. Maybe it was intended this way but just in case decided to report.

geky commented 5 months ago

Oh wow, looks like you're right. Wild that this hasn't been caught until now.

I guess it helped that the fact that littlefs's directories are sorted hasn't been the best documented. Other filesystems may have wildly different orderings.

We're also lucky none of littlefs's internals currently require sorted order. Otherwise we'd probably be stuck with this weird ordering for backwards compatibility reasons...

geky commented 5 months ago

Went ahead and created a PR here: https://github.com/littlefs-project/littlefs/pull/926

Thanks for reporting!

andriyndev commented 5 months ago

Are you sure it's still safe to change the sorting algorithm? From what I understand, on the existing filesystems files are already sorted on disk according to the old algorithm, and even after the fix they'll remain sorted according to it, and further files operations will lead to mixing when old files will remain sorted according to the old algorithm, while new and newly modified files will be sorted according to the new algorithm. Cannot it lead to some nasty bugs like inability to find an existing file because of wrong sorting (if it's taken into account during search). But if currently littlefs doesn't require sorted order at all, won't implementing such a feature lead to breaking the old filesystems where the incorrect ordering of at least some files persists.

andriyndev commented 5 months ago

As an option - apply the fix to the next major version, and fix the ordering of existing files through a migration process.

geky commented 5 months ago

This is a valid concern. This change definitely deserves caution.

I was also too cavalier earlier. I thought fixing this would not be a problem since littlefs does not rely on ordering internally, ordering is more a convenience provided to users. But further investigation indicates this will be a problem a problem with the code as-is as we do rely on ordering for early loop termination[1].

It may still be possible to fix this on a disk-minor-version[2]:

Change directory lookup to not terminate early. File creation would then need to two passes, but runtime complexity would not change $O(n) \rightarrow O(n)$.

This would allow littlefs to open files in directories of any ordering.
Bump the on-disk minor version lfs2.1 -> lfs2.2. This has happened before, but has caused migration pains for users. Maybe a second bump would be less of a pain due to learning from the first bump?

This would prevent breaking of old filesystems because old filesystems would reject the newer on-disk minor version. Though you can probably see how this can cause other headaches.

I realized after writing this that I think this is basically the "migration process" you mentioned.

I guess the question is, is this worth it? This may end up needing more user feedback.

[1] It's a bit indirect, ok it's very indirect, but:
1. lfs_dir_fetchmatch keeps track of the smallest id that compares LFS_CMP_GT: https://github.com/littlefs-project/littlefs/blob/3513ff1afc1d67adb2e6f492f0b9bc0d798fcb0d/lfs.c#L1275-L1279
2. At the end of lfs_dir_fetchmatch it returns LFS_ERR_NOENT if any LFS_CMP_GT ids were found: https://github.com/littlefs-project/littlefs/blob/3513ff1afc1d67adb2e6f492f0b9bc0d798fcb0d/lfs.c#L1337-L1340
3. lfs_dir_find terminates early on LFS_ERR_NOENT, while lfs_dir_fetchmatch updates both the mdir and the id. Since we terminate early, lfs_dir_find may miss later entries if the ordering changes: https://github.com/littlefs-project/littlefs/blob/3513ff1afc1d67adb2e6f492f0b9bc0d798fcb0d/lfs.c#L1523-L1525
[2] littlefs has a few more release options than other libraries because, as you noticed, storage is unforgiving:
- disk-major-version - Breaking on-disk changes
- disk-minor-version - Compatible on-disk changes, may be irreversible
- driver-major-version - Breaking API changes
- driver-minor version - Compatible API changes
- driver-patch-version - No API changes
Disk-major-versions are very disruptive, and unless we have other major changes this fix would not be worth a disk-major-version on its own.

andriyndev commented 5 months ago

OK. I think the fix should be postponed to the next major disk version, and be applied together with them.

littlefs-project / littlefs

Suspicious comparison at `lfs_dir_find_match()` #923