ipld / go-car

A content addressible archive utility
Other
151 stars 44 forks source link

Present codebase does not allow opening root-less .car files #26

Open ribasushi opened 4 years ago

ribasushi commented 4 years ago

https://github.com/ipld/go-car/blob/v0.0.4/car.go#L120-L122 seems somewhat superfluous. In the tests for https://github.com/ipfs/go-ipfs/pull/7011/ I worked around this by sticking an "empty CID" \x01\x55\x00\x00 into the root array, but there got to be a better way...?

If there is compelling argument to not accepting root-less .car files: please clarify it in this issue.

ribasushi commented 4 years ago

@whyrusleeping @mikeal let me know your thoughts on just removing this limitation ( strictly on read )

rvagg commented 4 years ago

https://github.com/ipld/specs/blob/master/block-layer/content-addressable-archives.md#number-of-roots

I allowed for rootless in the JavaScript version, but the problem we have now is that there's existing code in the wild that assumes it's receiving a root because of that line of code, so zero-roots disallowed is at the level of an almost-spec feature and maybe should be considered locked in for CARv1.

ribasushi commented 4 years ago

@rvagg fair enough. Do you have thoughts on

I worked around this by sticking an "empty CID" \x01\x55\x00\x00 into the root array

?

rvagg commented 4 years ago

That CID isn't valid in JavaScript unfortunately, there's a mismatch between go-multihash and js-multihash.

> new CID(Buffer.from([0x01, 0x55, 0x00, 0x00]))
Uncaught Error: multihash too short. must be > 3 bytes.

Maybe [0x01, 0x55, 0x00, 0x01, 0x00]? a length=1 identity multihash using raw? I'll put in a PR to js-multihash to see whether this is something that should be done there or go-multihash. This isn't going to propagate quickly, though, so your shorter version might not be safe for a while.

What's the use-case with zero-root CARs here. Is this just for constructing test fixtures? Or is it for something that would be used in production? I'm sure you've gone over this before but it's not stuck in my head, sorry.

Maybe it's fine to just adopt this as a "whatever works" approach for CARv1 and get this fixed in CARv2 where we could explicitly state that zero roots is OK. It could even go in as a recommendation for CARv1 if you want zero roots.

ribasushi commented 4 years ago

No worries, lots of concepts flying around, keeping track of all is HARD.

The use case for root-less cars is streaming out a dag into a car. You can not possibly know the root of that dag until the very end.

new CID(Buffer.from([0x01, 0x55, 0x00, 0x00])) Uncaught Error: multihash too short. must be > 3 bytes.

That looks like an outright bug and should be fixed regardless

rvagg commented 4 years ago

https://github.com/multiformats/js-multihash/pull/75

rvagg commented 4 years ago

discussion about zero roots "hack" vs "fix": https://github.com/ipld/specs/pull/255

rvagg commented 4 years ago

Fixed and released already!

> new CID(Buffer.from([0x01, 0x55, 0x00, 0x00])) 
CID(bafkqaaa)

Still unsure if this is a great idea or not. Starting to lean toward removing the zero-roots restriction here instead now.