Closed ukstv closed 2 years ago
Sorry, but undefined
should never have been allowed and was only possible because it used https://github.com/dignifiedquire/borc which would even let you encode Dates, and its own custom BigNumber objects as well. The IPLD data model has a strict set of kinds that work throughout systems that support IPLD and undefined
will break in many places. Your existing data with undefined
is going to break if you pass it through Go IPLD / IPFS code for instance. https://ipld.io/docs/data-model/kinds/
You know, SemVer was created for a reason.
Snark isn't particularly helpful. And while I agree and would prefer that js-ipfs be >1 by now so we can do proper signalling, that's a decision by others and they are currently using <0 semver semantics which do allow for breaking changes in minor release. So technically it is following semver.
Could you lift the restrictions for undefined, maybe, so that old IPFS records could be properly decoded still?
You're welcome to do this now by using the BLOCK API to fetch the raw bytes and use the older DAG-CBOR codec or just use a CBOR decoder directly (using borc) and deal with the CID tags yourself. If you really wanted to you could even fork this repo and replace the CBOR decoder with borc to make it behave how you want and then force js-ipfs to use it instead of this codec whenever it encounters a DAG-CBOR block to decode (I believe you can override existing codecs when you supply new ones that conflict). In fact, it's probably easier than that since you may be able to just undo https://github.com/ipld/js-dag-cbor/blob/master/index.js#L88 and the undefined
s will flow freely.
It's really important that you ensure that new data doesn't get encoded with undefined
, it'll be treated as a bad block by the newer JS stack and the Go stack is the same, and since Go runs most of the backend infra around the world for IPFS, that's bad news: https://github.com/polydawn/refmt/blob/30ac6d18308e584ca6a2e74ba81475559db94c5f/cbor/cborDecoder.go#L213-L218
Sorry for the annoyance on this, clarifying the boundaries of our codecs and ensuring fully compatibility across the stack has been a long journey. Hopefully the current level of clarity helps, though.
@rvagg Reference to go-ipfs is misleading in this case. Despite the linked change is 3 years old, go-ipfs 0.9 is able to return an old cbor too, unlike go-ipfs 0.10. Maybe this repository is not the right venue to discuss this, but anyway with the recent changes both go-ipfs and js-ipfs can not load old blocks.
I could blame own negligence on not following IPLD spec, but changes to the spec making undefined
prohibited are also recent. On 2020-10-14 it became prohibited, on 2021-01-28 the old codec became "legacy", and at that time there were no proper codec available.
Hey @rvagg this is a big problem for us as it basically breaks all existing data on Ceramic.
It's really important that you ensure that new data doesn't get encoded with
undefined
This makes sense and we will definitely do it going forward, but we need a solution for loading existing data. Does it really hurt to keep decode
functions that allow for undefined
to allow for loading of data that has already been created with older versions?
We also found that this strictness is also enforced on the go-ipfs side with the new go-ipld-prime, while it works in go-ipfs < 0.10.
Hey, so a few of us have been furiously discussing this issue and the merits and technical issues involved. One outstanding question has been the Go stack and whether it's ever been able to properly recognise undefined
. As I pointed out above, refmt isn't happy about undefined
by default so my assumption has been that trying to get go-ipfs to decode one of these blocks is going to end up with pain, either with current versions or historical versions.
But I decided to dig deeper because you've insisted that this isn't correct and we've been doing some head scratching. So here's what I've found, and it looks like we're in a dejavu situation from almost exactly 4 years ago:
undefined
and any of the other possible strange values that can come in via that route, effectively treating them as unset values (likely nil
or whatever the unset value of the field would be, although I haven't spent much time figuring this bit out): https://github.com/whyrusleeping/cbor/blob/63513f603b11583741970c5045ea567130ddb492/go/cbor.go#L408-L438undefined
among other inappropriate terminals) no longer works with go-ipfs, because undefined
is now rejected as an invalid token.So it looks like we're repeating 2018, but now with strictness on both JS and Go sides.
We still need to retain strictness on encode for this to all work nicely, though, so hopefully Ceramic, and anyone else who happens to be relying on undefined
can find a work-around for existing code that may rely on it.
Thanks for digging into this and finding a path forward @rvagg extremely appreciated! Let us know if we can help in any way on the js side, since the go side has a fix (yet to be confirmed to work?).
We still need to retain strictness on encode for this to all work nicely, though, so hopefully Ceramic, and anyone else who happens to be relying on
undefined
can find a work-around for existing code that may rely on it.
Definitely, no problem for us to not use undefined
on encode. We just need a way to be able to retrieve (and pin) old dag-cbor objects.
https://github.com/ipld/js-dag-cbor/pull/45 WIP, it's slightly more complicated than on the Go side which had the "coerce" flag ready to go.
We at Ceramic used the legacy codec to store data, when it was the only option. Apparently, some of the data contain
undefined
. New js-ipfs release can retrieve a block, but can not decode it, asundefined
is not allowed anymore in code. IMO, it is a bug to so drastically change the code and not make it visible for a general audience. You know, SemVer was created for a reason.As ipfs is released with this dag-cbor codec which is more strict than the legacy dag-cbor codec, I am wondering, what is the plan to accomodate for old blocks created by the legacy codec? Could you lift the restrictions for
undefined
, maybe, so that old IPFS records could be properly decoded still?