multiformats / multicodec

Compact self-describing codecs. Save space by using predefined multicodec tables.
MIT License
340 stars 204 forks source link

Add Skynet codecs #198

Closed peterjan closed 3 years ago

peterjan commented 4 years ago

This PR adds a Skynet codec to ultimately add ENS support for Skylinks.
Skynet is a decentralized CDN built on top of the Sia network, find out more here.

peterjan commented 4 years ago

@vmx @rvagg @Stebalien, I apologize for pinging you directly, but would it be possible for one of you to take a look at this PR? I am trying to add ENS support to Skynet and having a codec is a requirement ^^

Stebalien commented 4 years ago

Could you write up a very brief summary for what each one of these is? It helps to ensure they're categorized correctly. Specifically:

Other: This table is sorted by codec. Could you move this section up?

Off topic: I noticed you're hard-coding base64 for skylink links. I'd like to suggest that you use multibase (https://github.com/multiformats/multibase/), you'll save yourself some pain down the road when you, e.g., try to integrate with browsers.

peterjan commented 4 years ago

@Stebalien Thank you for getting back to me.

skynet-skylink-v1 is an identifier for what we call a Skylink, it is very comparable to an IPFS hash, and we do plan to upgrade to v2 at some point, to extend the schema or make things possible that we have currently not foreseen

Thank you for the tip on https://github.com/multiformats/multibase/ - we will definitely look into it, we probably can't make that change though as we do not want to exceed 46 bytes at all costs and we have already optimised to get it down to 46 as is.

P.S.:

Other: This table is sorted by codec. Could you move this section up?

oops, sorry for not doing that, might be a nice addition to the validator perhaps?

aschmahmann commented 4 years ago

@peterjanbrone

Multibase is a protocol for disambiguating the encoding of base-encoded (e.g., base32, base36, base64, base58, etc.) binary appearing in text.

This means that your size in the binary/wire representation will not increase by using multibase only the text representation will.

peterjan commented 4 years ago

@peterjanbrone

Multibase is a protocol for disambiguating the encoding of base-encoded (e.g., base32, base36, base64, base58, etc.) binary appearing in text.

This means that your size in the binary/wire representation will not increase by using multibase only the text representation will.

I see, thank you, I will definitely look into it! I do fear it's not something we are likely to adopt any time soon seeing as it would change the textual representation of a Skylink. We need that to remain the same for the time being.

aschmahmann commented 4 years ago

I do fear it's not something we are likely to adopt any time soon seeing as it would change the textual representation of a Skylink. We need that to remain the same for the time being.

That's totally understandable concern, changing user facing identifiers is hard. IPFS actually experienced similar issues around the introduction of CID (i.e. CIDv0 has implicit assumptions like the base encoding and the codec that were needed to make existing tooling compatible with the new CID format).

However, I'm a little confused. You've said you'd like to use this codec in a CID and the CID (on the wire and text) is going to be different from how you're working with existing Skylinks (on the wire and text) now (e.g. the newly reserved codec number will show up in the CID) how are you planning to keep things the same? Additionally the CIDv1 spec comes with multibase included so I'm not sure I understand why you think it wouldn't be compatible.

One more thing. I'm not an expert on ENS by any means, but if you're just looking for ENS compatibility their spec https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1577.md seems to imply that just having the namespace is enough for things to work. For example just having <skylink-ns><skylink-address> should be sufficient.

Arachnid commented 3 years ago

Correct - you don't have to use CIDs if they're not appropriate; you just need to obtain a multicodec namespace ID, and encode the content hash field as <skylink-ns><skylink-address>, specifying this format somewhere (such as in an EIP). For compactness it'd be preferable if the address was in binary form rather than text format for compactness.

peterjan commented 3 years ago

I have dropped all codecs but one, being the skynet-ns. I think by doing so I have addressed all of the above comments and concerns. Thanks for all of the advice and feedback!

peterjan commented 3 years ago

@vmx @rvagg @Stebalien, if you would be so kind to give this another review please. We would like to move ahead and integrate ENS as soon as possible.

rvagg commented 3 years ago

I will defer to @aschmahmann & @Stebalien on this one as they understand the domain better

aschmahmann commented 3 years ago

@peterjanbrone @rvagg yep, the 3 byte namespace LGTM :+1: (after the small table formatting issue above is resolved)


@peterjanbrone as an aside if you're interested (now or in the future) in getting Skynet to utilize CIDs I'd be happy to discuss it with you, just give me a ping :smile:

peterjan commented 3 years ago

@rvagg updated, thanks!