multiformats / multicodec

Compact self-describing codecs. Save space by using predefined multicodec tables.
MIT License

Add multidid code #304

Status: Closed (oed closed this 1 year ago)

oed commented 1 year ago

This PR adds the 0x0d code for representing a multidid.

Picked a quite low number, which seems reasonable given the possible future prevalence of DIDs.

rvagg commented 1 year ago

I'd like @expede, and maybe @bumblefudge and others, to weigh in on this because I'm not as active in this space. 0x0d is <= 127, so it gets a single-byte representation, which is precious, like gold, so this has to be really worth it to take up space there.

@oed would it make sense to reserve a higher code until we can justify moving to a lower one? Is that a model that might work here, instead of going off a hunch that "DIDs are the future" (which there seems to be significant scepticism about)?
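
To make the single-byte point concrete: multicodec codes are written as unsigned varints, which carry 7 payload bits per byte, so any code up to 127 (0x7f) fits in one byte and anything larger needs at least two. A minimal Python sketch of that encoding follows; `encode_varint` is an illustrative helper written for this thread, not the API of any particular multiformats library.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        low7 = value & 0x7F
        value >>= 7
        if value:
            out.append(low7 | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(low7)  # final byte: continuation bit clear
            return bytes(out)


# Compare the two candidate codes discussed here, plus the 127/128 boundary.
for code in (0x0D, 0x7F, 0x80, 0x0D1D):
    encoded = encode_varint(code)
    print(f"{code:#06x} -> {len(encoded)} byte(s): {encoded.hex()}")
```

Under this encoding 0x0d costs one byte on the wire and 0x0d1d costs two, so the difference being argued over is one byte per encoded value.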

oed commented 1 year ago

> until we can justify moving to a lower one?

What are the criteria for this? As far as I can tell DIDs are as adopted, if not more adopted, than CIDs.

rvagg commented 1 year ago

There are no clear criteria other than "there's only 75 of these left, let's not try to waste them on things that aren't going to see much use or that won't really benefit from just having a single byte".

> DIDs are as adopted, if not more adopted, than CIDs

but we're not talking about DIDs as a whole, we're talking about DIDs combined with a multicodec code, and even more specifically than that—DIDs used within the multidid model; so it depends on adoption of that.

How about this—can you speak to the benefit of having a single byte for this? How much does a super-compact representation matter for this use case? I think, from a quick read of the draft, that you're going to stack codes together at the front of these things, so I imagine the desire here is to minimise all those leading bytes since you're likely going to have the second varint be >=2 bytes anyway?
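
A rough way to size the question above is to add up the varint lengths of the stacked leading codes. The sketch below assumes, purely for illustration, that the second code is 0xed (ed25519-pub in the multicodec table); the actual layout is whatever the multidid draft specifies, and `varint_len` is a hypothetical helper.

```python
def varint_len(value: int) -> int:
    """Bytes needed to store value as an unsigned varint (7 bits per byte)."""
    if value < 0:
        raise ValueError("varints encode non-negative integers only")
    return max(1, (value.bit_length() + 6) // 7)


# Assumed example of a second stacked code; not taken from the draft itself.
SECOND_CODE = 0xED

for label, first_code in (("0x0d", 0x0D), ("0x0d1d", 0x0D1D)):
    total = varint_len(first_code) + varint_len(SECOND_CODE)
    print(f"multidid code {label}: {total} leading bytes before the payload")
```

With these assumed values the prefix is 3 bytes with 0x0d and 4 bytes with 0x0d1d, since any second code of 128 or more already takes two bytes on its own.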

rvagg commented 1 year ago

also, how can you resist 0x0d1d?

oed commented 1 year ago

> but we're not talking about DIDs as a whole, we're talking about DIDs combined with a multicodec code, and even more specifically than that, DIDs used within the multidid model; so it depends on adoption of that.

Yes, good point. For context, the place where multidid is going to be used is the IPLD representations of UCAN and SIWE, which are the two main object-capability systems in IPLD land. AFAIK, there's no specific alternative to DIDs being considered for either of these formats. This is why I think it will end up getting quite a large amount of usage, as both are starting to get a fair amount of traction across the web3 ecosystem.

rvagg commented 1 year ago

OK, so we don't have clear heuristics for saying no to single-byte values, even though my personal default is no unless it's something very common, or very likely to become very common (perhaps a new codec that does all the magical things that everyone loves, or a new super-fast hash function that everyone's itching to jump on to). But the multi* entries do present a bit of a special case, where they're the root of a larger ecosystem of types and are also likely to be stacked. So there's potential justification in there. I still think 0x0d1d is better, but that's between you and the crew over in https://github.com/ChainAgnostic/multidid/pull/2.

@vmx any thoughts on this?

vmx commented 1 year ago

I'm with @rvagg here. I would only add things to the 1 byte range if there already is wide adoption, or if there's a technical/very good reason why it has to be a single byte. I also agree that multi* things are kind of a special case, though. Hence I come to the same conclusion that 0x0d1d would be a good fit.

oed commented 1 year ago

> I would only add things to the 1 byte range if there already is wide adoption

This seems a bit contradictory. Wouldn't changing the code introduce a breaking change?

vmx commented 1 year ago

> > I would only add things to the 1 byte range if there already is wide adoption
>
> This seems a bit contradictory. Wouldn't changing the code introduce a breaking change?

It could also be a "soft" breaking change. You could have both the old and the new one assigned.

Looking at other Multiformats/IPLD-related things: we are currently talking about a CIDv2 (it will likely be something else). The key ordering of DAG-CBOR is a pain (the recommended order changed in the most recent CBOR RFC), so a v2 would be cool (won't happen though). Even Multihash is having interesting challenges with parametrized hashes. I know almost nothing about DIDs, but perhaps that's even an opportunity: start with a number in the 2 byte range, get wide adoption, and then have the chance to change that one little thing you originally missed and combine that with getting something in the 1 byte range.
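
As a rough illustration of the "soft" breaking change idea, a reader could accept both assigned codes while writers move over to the new one. This is only a sketch under stated assumptions: the two code values stand in for a hypothetical old/new pair, and `decode_varint` and `strip_multidid_prefix` are made-up helper names, not functions from any existing library.

```python
def decode_varint(data: bytes) -> tuple[int, int]:
    """Decode a leading unsigned varint; return (value, bytes consumed)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:  # continuation bit clear: this was the last byte
            return value, i + 1
    raise ValueError("truncated varint")


# Hypothetical pair: a two-byte code assigned first, a one-byte code added later.
OLD_CODE = 0x0D1D
NEW_CODE = 0x0D
ACCEPTED_CODES = {OLD_CODE, NEW_CODE}


def strip_multidid_prefix(data: bytes) -> bytes:
    """Return the bytes after the leading code, accepting either assignment."""
    code, consumed = decode_varint(data)
    if code not in ACCEPTED_CODES:
        raise ValueError(f"unexpected leading code {code:#x}")
    return data[consumed:]
```

Readers built this way keep working with data written under either code, which is what makes the change "soft"; writers would still need to agree on when to switch.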

oed commented 1 year ago

Updated the number based on the discussion.