ipld / specs

unixfsv2: new proposal from @warpfork #295

Closed by mikeal 3 years ago

mikeal commented 3 years ago

Adapted @warpfork’s proposal to a valid schema. Source: https://gist.github.com/warpfork/121a9f89c2a9ca1c642c27ae4101686e

mikeal commented 3 years ago

We need to make Attribs a link if we want to be able to manage the HAMT leaf node size in a reasonable way. Thoughts @warpfork?

warpfork commented 3 years ago

IMO we should expect/aim to see fewer than half a dozen variations of the Attribs structure evolve, and all of them should be small.

"small" meaning: structs (finite size, the keys are pre-specified, etc.); and (perhaps surprisingly) none of the values are strings or bytes (the two scalars that can get big), and certainly none of them are maps or lists (which also obviously get big).

This is actually surprisingly feasible. The currently proposed ones all "coincidentally" fit this pattern. (It's almost like posix filesystems were also under evolutionary pressure to have predictable sizes for some kind of "block"^W "sector" alignment predictability purposes...)

Does this mean things like Linux's "xattrs" don't fit my definition of Attribs here? Yes, absolutely. (Those are a {String:String}.) If someone does need a variation of unixfsv2 that supports those? Cool: yeah, they'll also need to make a link somewhere in the vicinity to keep the block size controllable and sane.
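
To illustrate why a struct with pre-specified keys and integer-only values has a predictable size, here is a minimal sketch that computes a worst-case encoded size. It assumes a CBOR-style map encoding in which a 64-bit integer takes at most 9 bytes. The field names are hypothetical and are not taken from the proposal.

```python
# Worst-case encoded size of a struct whose values are all integers,
# assuming a CBOR-style map representation. Field names are hypothetical.

MAX_INT_LEN = 9  # 1 head byte + 8 payload bytes for a 64-bit integer


def cbor_head_len(n):
    """Bytes needed for a CBOR head whose argument (length or count) is n."""
    if n < 24:
        return 1
    if n < 2**8:
        return 2
    if n < 2**16:
        return 3
    if n < 2**32:
        return 5
    return 9


def max_struct_size(field_names):
    """Upper bound, in bytes, for a map-encoded struct with integer values."""
    size = cbor_head_len(len(field_names))  # map header
    for name in field_names:
        key = name.encode("utf-8")
        size += cbor_head_len(len(key)) + len(key)  # text-string key
        size += MAX_INT_LEN  # integer value, worst case
    return size


ATTRIB_FIELDS = ["mtime", "posix", "uid", "gid"]  # hypothetical field set
print(max_struct_size(ATTRIB_FIELDS))
```

No such bound exists once a value is a string, bytes, map, or list, which is why a {String:String} structure like xattrs would need a link to keep the block size under control.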

warpfork commented 3 years ago

Alternatively... Attribs might be perfectly reasonable to turn into a link. It's likely to be a highly repeated subunit (or at least, any of the variations of Attribs that don't have an "mtime" property will be!). And if it dedups a lot, arranging the link/block edges to make that apparent could be kinda neat.
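
The dedup argument can be sketched as follows. This is only an illustration: a SHA-256 digest over canonical JSON stands in for a real CID, and the entry names and attribute fields are hypothetical.

```python
import hashlib
import json


def block_id(obj):
    """Stand-in for a CID: SHA-256 over a canonical JSON encoding."""
    data = json.dumps(obj, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(data).hexdigest()


def store_entries(entries):
    """Store each entry's attribs as its own block and link to it by id."""
    blocks = {}  # id -> attribs block (identical blocks collapse to one key)
    linked = []  # directory entries holding a link instead of inline attribs
    for name, attribs in entries:
        link = block_id(attribs)
        blocks[link] = attribs
        linked.append({"name": name, "attribs": link})
    return blocks, linked


# Hypothetical directory: 1000 files with identical permissions and owner.
no_mtime = [
    ("file%d" % i, {"posix": 0o644, "uid": 1000, "gid": 1000})
    for i in range(1000)
]
# Same files, but each carries a distinct mtime.
with_mtime = [
    ("file%d" % i, {"posix": 0o644, "uid": 1000, "gid": 1000, "mtime": 1600000000 + i})
    for i in range(1000)
]

blocks_a, _ = store_entries(no_mtime)
blocks_b, _ = store_entries(with_mtime)
print(len(blocks_a), len(blocks_b))  # 1 1000
```

Without mtime, all 1000 entries share a single Attribs block. With a per-file mtime, every block is distinct and linking buys no dedup.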

mikeal commented 3 years ago

If we allow for inline attributes we’ll need to recommend bucket sizes for each Attribute schema so that they can reliably fit inside the leaf nodes without going over the max block size.
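
A rough sketch of how such a recommendation could be derived: divide the usable block size by the worst-case size of one entry. Every number below (the block limit, key size, inline Attribs size, link size, and node overhead) is an assumption chosen for illustration, not a value from the spec.

```python
def max_bucket_size(max_block_bytes, entry_bytes, node_overhead_bytes=0):
    """Largest entry count per leaf node that stays within the block limit."""
    if entry_bytes <= 0:
        raise ValueError("entry_bytes must be positive")
    usable = max_block_bytes - node_overhead_bytes
    return max(usable // entry_bytes, 0)


# All values below are assumptions for illustration only.
MAX_BLOCK = 1 << 20   # assumed 1 MiB block limit
NODE_OVERHEAD = 64    # assumed fixed per-node overhead
KEY_BYTES = 255 + 2   # assumed worst-case file name plus string header
INLINE_ATTRIBS = 57   # assumed worst-case size of one inline Attribs struct
LINK_BYTES = 40       # assumed approximate size of an encoded link

inline = max_bucket_size(MAX_BLOCK, KEY_BYTES + INLINE_ATTRIBS, NODE_OVERHEAD)
linked = max_bucket_size(MAX_BLOCK, KEY_BYTES + LINK_BYTES, NODE_OVERHEAD)
print("inline attribs:", inline, "entries per leaf")
print("linked attribs:", linked, "entries per leaf")
```

Each Attribs schema would have its own worst-case size, so the recommended bucket size would differ per schema. A link has roughly the same size regardless of which schema it points to.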

warpfork commented 3 years ago

Although we merged this, I really want to follow up on it further. The gist this PR drew from has a lot more content that I think is still relevant. In particular, the gist spends a fair amount of energy making it clear that we want to build in the direction of "a heap of reusable components" rather than a singular monolithic no-deviation-allowed spec for unixfsv2, and I really want to make that clear in our specs in this repo as well.