littlefs-project / littlefs

A little fail-safe filesystem designed for microcontrollers
BSD 3-Clause "New" or "Revised" License

LFS_TYPE_FROM enum type never used in source code #882

Closed alofthouse closed 8 months ago

alofthouse commented 8 months ago

Hi, I'm trying to understand the internal workings of littlefs (which has been made a lot easier by the great documentation in the DESIGN + SPEC docs) and I came across the LFS_TYPE_FROM enum value in lfs_type.

At least on v2.8.0, this isn't used anywhere in the source code though. There is a reference to 0x100 in lfs_dir_traverse_filter() with the comment:

which mask depends on unique bit in tag structure

but that suggests this is being used in a way similar to the valid bit instead of the other type identifiers perhaps?

Thanks 😄

geky commented 8 months ago

Hi @alofthouse, good question.

I can see how LFS_TYPE_FROM missing from SPEC.md would be confusing. The reason is that LFS_TYPE_FROM should never actually be written to disk.

SPEC.md is only concerned with how the disk is organized, and another hypothetical driver is free to ignore LFS_TYPE_FROM completely.

Disk aside, in this driver it's useful to have a couple internal-only tags that tell lfs_dir_commit to do something a bit more complex than write a tag:

You may have noticed these types tend to follow some patterns, with the top 3 bits indicating a sort of "supertype", and the lower 8 bits indicating a "subtype". For example: 0x2xx is the supertype for all file structs, and 0x201, the "inlinestruct", is just one specific representation.
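To make the supertype/subtype split concrete, here is a minimal sketch. The `sketch_` helpers are hypothetical names invented for illustration and are not part of littlefs; the only assumption is the 3-bit/8-bit layout described above.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical helpers (not littlefs API): split an 11-bit tag type into
// the 3-bit supertype and the 8-bit subtype described in this thread.
static inline uint16_t sketch_supertype(uint16_t type) {
    return type & 0x700; // keep only the top 3 bits
}

static inline uint8_t sketch_subtype(uint16_t type) {
    return (uint8_t)(type & 0x0ff); // keep only the low 8 bits
}

static void sketch_split_demo(void) {
    // 0x201 is the "inlinestruct": supertype 0x2xx, subtype 0x01
    assert(sketch_supertype(0x201) == 0x200);
    assert(sketch_subtype(0x201) == 0x01);
}
```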

Since we never use the 0x1xx supertype on disk, it is free to be used in the driver to represent any internal-only tags without risk of conflicts. At least in the current version of littlefs.

In theory, the LFS_TYPE_FROM supertype could be used with a mask to filter for all internal-only tags, but in practice we haven't needed to do that. Because we don't use it, LFS_TYPE_FROM could be removed, but I guess at least it serves as an internal hint that that range of tags is reserved.
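As a rough sketch of what such a filter might look like (the driver does not do this today, per the above): the `SKETCH_` constants and `sketch_is_internal` are hypothetical, and the value 0x100 for the supertype is an assumption drawn from the "0x1xx" range mentioned in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Assumed values for illustration only: 0x100 stands in for the
// internal-only 0x1xx supertype, 0x700 selects the 3 supertype bits.
#define SKETCH_TYPE_FROM      0x100
#define SKETCH_SUPERTYPE_MASK 0x700

// Hypothetical helper: true if the type falls in the internal-only range.
static inline bool sketch_is_internal(uint16_t type) {
    return (type & SKETCH_SUPERTYPE_MASK) == SKETCH_TYPE_FROM;
}

static void sketch_internal_demo(void) {
    assert(sketch_is_internal(0x101));  // some tag in the 0x1xx range
    assert(!sketch_is_internal(0x201)); // inlinestruct, an on-disk type
    // Comparing the whole supertype matters: a bare (type & 0x100) test
    // would also match 0x3xx, 0x5xx, and 0x7xx.
    assert(!sketch_is_internal(0x301));
}
```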


> which mask depends on unique bit in tag structure
>
> but that suggests this is being used in a way similar to the valid bit instead of the other type identifiers perhaps?

Ah, this is a red herring. It's also a subtle (hacky?) part of the tag encoding that is a bit tricky.

A question came up during development: When do two tags represent the same file attribute?

Clearly 0x301 (user attr 0x01) and 0x302 (user attr 0x02) are different attributes. But 0x600 (soft tail) and 0x601 (hard tail) are really the same attribute with a different flag indicating if the tail is soft or hard. If you append a hard tail to a metadata log, it should effectively replace a previous soft tail, not end up with both a soft and hard tail.

So the supertypes of the tags were arranged in such a way that the lowest bit of the supertype (the 0x100 bit) indicates whether the subtype, the lowest 8 bits, should be included when comparing tags, or ignored. This is why lfs_dir_traverse_filter changes the mask. This is also some of why the tag encoding may seem a bit random:

0x4xx  LFS_TYPE_SPLICE    - Don't care, splice tags are ignored in dir filter
0x0xx  LFS_TYPE_NAME      - Don't care, a file should never have more than one name
0x2xx  LFS_TYPE_STRUCT    - Subtype should be ignored, indicates struct encoding
0x3xx  LFS_TYPE_USERATTR  - Subtype is unique, indicates which user attr
0x6xx  LFS_TYPE_TAIL      - Subtype should be ignored, indicates hard vs soft tails
0x7xx  LFS_TYPE_GSTATE    - Subtype is unique, indicates gstate type
0x5xx  LFS_TYPE_CRC       - Don't care, CRC tags are ignored in dir filter
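The comparison rule in the table can be sketched roughly as follows. This is not the code from lfs_dir_traverse_filter: the `sketch_` names are hypothetical, and the sketch looks only at the 11-bit type, ignoring the rest of the tag, which a real comparison would presumably also need to consider.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: choose the comparison mask from the "unique" bit.
// If 0x100 is set the subtype is part of the identity (full 0x7ff mask);
// otherwise only the supertype is compared (0x700 mask).
static inline uint16_t sketch_compare_mask(uint16_t type) {
    return (type & 0x100) ? 0x7ff : 0x700;
}

// Hypothetical helper: do two types name the same attribute?
static inline bool sketch_same_attr(uint16_t a, uint16_t b) {
    uint16_t mask = sketch_compare_mask(a);
    return (a & mask) == (b & mask);
}

static void sketch_compare_demo(void) {
    // user attr 0x01 vs user attr 0x02: different attributes
    assert(!sketch_same_attr(0x301, 0x302));
    // soft tail vs hard tail: same attribute, so one replaces the other
    assert(sketch_same_attr(0x600, 0x601));
}
```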

Is this a clever solution? An ugly hack? That's up to you to decide : )

alofthouse commented 8 months ago

Thanks very much for that explanation @geky, that all makes perfect sense.

And to answer your final question, I think it's very much on the elegant side of things to use every bit available to save RAM 😉