littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers
BSD 3-Clause "New" or "Revised" License
5.1k stars 788 forks source link

Checksum(CRC32) comparisons against special values. #532

Open gtaska opened 3 years ago

gtaska commented 3 years ago

It is important to only ever compare checksums against each other, and never against fixed values.

It is possible for calculated checksums to have any value, including values like 0, 0x7FFFFFFF and 0xFFFFFFFF, so it is important to never assign any special meaning to a them.

See an example here.

BenBE commented 3 years ago

There are situations where you can expect certain CRC values. One such situation is when you calculate the CRC over a block with the block's CRC at the end. In that case the overall CRC will be defined based on how the CRC is defined (usually 0 or -1). But apart from that you're right with this.

gtaska commented 3 years ago

True, but that should be stated as such - and ideally calculated at startup. This embeds assumptions about the results about the checksum algorithm, and makes it very hard to update/change it later.

Ideally the code would be written in such a way that the checksum algorithm can be replaced, with no additional tweaking in the rest of the code.

As an example - while investigating a performance issue, I tried to replace the checksum generation with a hardware-calculated one, but it failed due these additional assumptions dotted through the code. (There are numerous crc32 variants, and not all of them produce 0 or -1 outputs for the cases required)

geky commented 3 years ago

As @BenBE notes, the CRC function we use in LittleFS is well defined when it includes itself at the end (crc(x ++ crc(x)) == 0). Though it is a valid point this restricts LittleFS to checksums where this property holds.

Ideally the code would be written in such a way that the checksum algorithm can be replaced, with no additional tweaking in the rest of the code.

Unfortunately, changing the checksum this way breaks portability. It's probably fine for most embedded applications, but would break any external tooling.

It wouldn't be too hard to add a checksum-type field, though I haven't ran into a compelling reason yet. I'm curious, did you end up needing to use a different checksum?

See an example here.

This one is a bit annoying to replace since it's actually looping over multiple checksums. But I actually have a branch I'm working on that refactors this quite a bit, so it should be easy to adopt there.

gtaska commented 3 years ago

No, I didn't end up switching checksums - as far as I could tell, it ended up not being a significant bottleneck on my platform.

At the very least, please document any assumptions made by the code about the checksum results.

Have you considered the possibility of replacing checksums with ECC, allowing the flexibility of fixing a bit flip or two?

geky commented 3 years ago

I've looked into using ECC before, but so far haven't found an algorithm that's a great fit. ECC is much more expensive and the types used on disks are designed to work with fixed block sizes. It also raises the question of how many bit-flips per byte is expected, though this could be a configurable value.

At the moment it's possible to use ECC in the block device layer, where I think it is a better fit. (And a number of NAND devices already provide ECC on the disk side).


One case I was considering for alternate checksums was to allow cryptographic hashes, in the case you want to create a unique identifier for the state of the filesystem (currently working on a full-filesystem hash). Though I'm not sure what interest there would be in such a feature.