multiformats / cid

Self-describing content-addressed identifiers for distributed systems

"identity" multibase CIDv1 binary #39

Closed aratz-lasa closed 4 years ago

aratz-lasa commented 4 years ago

Given a binary CID cid, let N be the first varint in cid. If N == 0, could it not be a CIDv1 encoded with the identity multibase? I ask because the dag-cbor spec says that a CIDv1 should be encoded with the identity multibase. However, following the Decoding Algorithm would raise an error instead of decoding it.

So, should the decoding algorithm be modified, or does CID not natively support encoding/decoding a CIDv1 with the raw-binary identity multibase?
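
For illustration, the situation in the question can be sketched as follows. This is a simplified stand-in for the spec's Decoding Algorithm, not a copy of it: the function names `read_varint` and `sniff_binary_cid_version` are hypothetical, and the sketch assumes the version is an unsigned LEB128 varint and that a CIDv0 is a bare 34-byte sha2-256 multihash.

```python
from typing import Tuple


def read_varint(data: bytes) -> Tuple[int, int]:
    """Decode an unsigned LEB128 varint; return (value, bytes_consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if byte & 0x80 == 0:  # high bit clear: this was the last byte
            return value, i + 1
    raise ValueError("truncated varint")


def sniff_binary_cid_version(cid: bytes) -> int:
    """Return the CID version of a binary CID (hypothetical helper)."""
    # Assumption: CIDv0 is exactly 0x12 0x20 followed by a 32-byte digest.
    if len(cid) == 34 and cid[:2] == b"\x12\x20":
        return 0
    n, _ = read_varint(cid)
    if n == 1:
        return 1
    # A leading 0 byte (the identity multibase prefix) lands here.
    raise ValueError(f"unsupported or malformed CID (first varint = {n})")


# A made-up CIDv1: version 1, codec 0x55 (raw), sha2-256 multihash of zeros.
cid_v1 = bytes([0x01, 0x55, 0x12, 0x20]) + bytes(32)
version = sniff_binary_cid_version(cid_v1)
```

Passing `b"\x00" + cid_v1` to the same function raises `ValueError`, which is the error described above.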

Stebalien commented 4 years ago

We may need to improve the documentation a bit here...

The TL;DR is, CIDs in DagCBOR are encoded as \0-<binary-cid-minus-the-multibase> (where \0 is NULL or the 0 byte). To decode, strip the leading 0, then decode as a binary CID.
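
A minimal sketch of that rule, assuming the binary CID is already available as `bytes`. The helper names `wrap_cid_for_dagcbor` and `unwrap_cid_from_dagcbor` are hypothetical and not part of any CID library; the sketch only handles the leading 0 byte, not the surrounding CBOR encoding.

```python
IDENTITY_MULTIBASE = b"\x00"  # the NULL byte, not the ASCII character "0"


def wrap_cid_for_dagcbor(binary_cid: bytes) -> bytes:
    """Prepend the 0 byte to a binary CID (CIDv0 or CIDv1)."""
    return IDENTITY_MULTIBASE + binary_cid


def unwrap_cid_from_dagcbor(payload: bytes) -> bytes:
    """Strip the leading 0 byte and return the plain binary CID."""
    if payload[:1] != IDENTITY_MULTIBASE:
        raise ValueError("expected a leading 0x00 byte before the binary CID")
    return payload[1:]


# A made-up CIDv1: version 1, codec 0x55 (raw), sha2-256 multihash of zeros.
binary_cid = bytes([0x01, 0x55, 0x12, 0x20]) + bytes(32)
payload = wrap_cid_for_dagcbor(binary_cid)
restored = unwrap_cid_from_dagcbor(payload)
```

The bytes returned by the unwrap step are what would then be handed to a CID library's binary decoder.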

Using the "identity multibase" prefix in DagCBOR is mostly a historical mistake. I call it a mistake because we prepend the "identity multibase" to CIDv0 CIDs as well, even though CIDv0 doesn't use multibase.

aratz-lasa commented 4 years ago

Thank you @Stebalien for the explanation.

Just to confirm: DagCBOR CID encoding/decoding should then be done "manually" (adding/removing the 0 byte, then using the CID decoder) instead of directly using the CID library, right?

If that is the case, does it mean that CID libraries such as py-cid (which currently cannot decode an "identity multibase" CIDv1) are not expected to decode "identity multibase" encoded CIDv1s?

Stebalien commented 4 years ago

> Just to confirm: DagCBOR CID encoding/decoding should then be done "manually" (adding/removing the 0 byte, then using the CID decoder) instead of directly using the CID library, right?

Yes. To encode, prepend the 0 byte (0x00, not the ASCII character "0") to a binary CID. To decode, remove it.

> If that is the case, does it mean that CID libraries such as py-cid (which currently cannot decode an "identity multibase" CIDv1) are not expected to decode "identity multibase" encoded CIDv1s?

You should call from_bytes and cid.buffer().

aratz-lasa commented 4 years ago

> You should call from_bytes and cid.buffer().

Okay, now everything is clear. I will need to open a PR to change it, because right now it does not decode it. Thanks for your help! (And sorry for the late response.)