multiformats / cid

Self-describing content-addressed identifiers for distributed systems

Redefine CID version to be multicodec #40

Closed. oed closed this 4 years ago

oed commented 4 years ago

I'm not sure how best to update the decoding algorithm:

   * Otherwise, let `N` be the first varint in `cid`. This is the CID's version.
     * If `N == 1` (CIDv1):
       * The CID's multicodec is the second varint in `cid`
       * The CID's multihash is the rest of the `cid` (after the second varint).
       * The CID's version is 1.
     * If `N <= 0`, the CID is malformed.
     * If `N > 1`, the CID version is reserved.
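
For reference, the branch quoted above can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names `read_uvarint` and `decode_versioned` are hypothetical, the input is assumed to be a binary CID, and the steps that precede "Otherwise" in the spec are not covered.

```python
from typing import Tuple


def read_uvarint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one unsigned LEB128 varint; return (value, next_offset)."""
    value = 0
    shift = 0
    for i in range(offset, len(buf)):
        byte = buf[i]
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: this is the last byte
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def decode_versioned(cid: bytes) -> dict:
    """Hypothetical helper mirroring the quoted branch of the algorithm."""
    # The first varint is the CID's version.
    version, pos = read_uvarint(cid)
    if version == 1:
        # The second varint is the multicodec; the rest is the multihash.
        codec, pos = read_uvarint(cid, pos)
        return {"version": 1, "codec": codec, "multihash": cid[pos:]}
    if version <= 0:
        # An unsigned varint cannot be negative, so this only catches 0.
        raise ValueError("malformed CID")
    raise ValueError("CID version %d is reserved" % version)
```

For example, `decode_versioned(bytes([0x01, 0x55, 0x12, 0x20]) + bytes(32))` yields version 1 with codec `0x55` and a 34-byte multihash.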

Should it just try to match the first bytes against the CID version? I guess we need to account for the possibility that future CID versions might be larger.

Fixes https://github.com/multiformats/cid/issues/25

Stebalien commented 4 years ago

Well, the encoding of any given CID will depend on the version, so the decoding algorithm would have to be:

  1. Check if it starts with the sha256 multicodec. If so, it's a CIDv0.
  2. Check if it starts with the cidv1 multicodec. If so, it's a CIDv1. ... how to parse a cidv1 ...
  3. Otherwise, it's something else (possibly CIDv2).
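
A rough Python sketch of that dispatch, assuming a binary CID and the multicodec table values `0x12` for sha2-256 and `0x01` for cidv1. The function names are hypothetical, and the actual CIDv1 parsing is deliberately left out, as it is in the list above.

```python
from typing import Tuple

SHA2_256 = 0x12  # sha2-256 code in the multicodec table
CIDV1 = 0x01     # cidv1 code in the multicodec table


def read_uvarint(buf: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode one unsigned LEB128 varint; return (value, next_offset)."""
    value = 0
    shift = 0
    for i in range(offset, len(buf)):
        byte = buf[i]
        value |= (byte & 0x7F) << shift
        if byte & 0x80 == 0:  # high bit clear: this is the last byte
            return value, i + 1
        shift += 7
    raise ValueError("truncated varint")


def classify(cid: bytes) -> str:
    """Hypothetical helper: decide which kind of CID the bytes start as."""
    # Reading the prefix as a varint also handles codes longer than one byte.
    prefix, _ = read_uvarint(cid)
    if prefix == SHA2_256:
        return "CIDv0"
    if prefix == CIDV1:
        return "CIDv1"
    return "other"  # possibly a future version such as CIDv2
```

Because the leading multicodec is read as a varint, a future version code that needs more than one byte would still fall through to the last branch without being misread.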

It would also be nice to add some motivation for why we're using multicodecs. Specifically, doing so makes CIDs fully self-describing and unambiguous in all contexts where multiformats can be used.

oed commented 4 years ago

I made a small update to the algorithm, which I think should be sufficient.