ipld / go-car

A content addressible archive utility
Other
149 stars 44 forks source link

fix: BlockMetadata#Offset should be for section, not block data #497

Closed rvagg closed 1 year ago

rvagg commented 1 year ago

@willscott while the CARv2 offsets were completely broken prior to #491 and I made it match for both CARv1 and CARv2; I changed the meaning of "Offset" in the process, making it the data offset, not section offset. Unfortunately because there were no tests or docs in there the intention wasn't clear so I made an assumption. I just discovered this after updating go-car in vanilla Boost and getting the section read errors ("bad cid version").

So this is a bugfix, to change it back to section offset but also make both CARv1 and CARv2 work (+ docs + tests).

rvagg commented 1 year ago

https://github.com/ipld/go-car/commit/09d23db97574f786c8ff5df7aba3cfa75fee261e#diff-fc451b22af005ef3ae74c133e1c434be409f228975d20ade449c92eeb3341c3aL219 Size appears to be as intended, the length of the block data, not section; which is helpful because with Size and Cid you can derive the length of the section too. So I think we're good with this.

rvagg commented 1 year ago

Just tried this commit in boost and what I was tinkering with is passing now.

In the current boost, Size is only used for BlockstoreGetSize which is the right number. There's no LimitReader at the moment for reading the section, it just seeks to the offset and does a ReadNode to slurp in the section.

I'm thinking that a follow-up could add a bunch more information - rather than having to derive the block data offset from cid length + varint(cid length + block length) you could just have it.

But maybe see what we need in Boost first? This might make it simpler over there because there's already some calculation going on - in one of my changes to your branch I introduced a section offset calculation based on the offset, that can go away now.

willscott commented 1 year ago

agree - happy to wait and see if we need more from the boost perspective