multiformats / multicodec

Compact self-describing codecs. Save space by using predefined multicodec tables.
MIT License

Add AES & ChaCha keys #228

Closed (expede closed 3 years ago)

expede commented 3 years ago

Add common symmetric keys. We have a use case for storing & (securely) transmitting symmetric keys to encrypted data on IPFS.

Why No Operation Modes?

I didn't split out -CBC, -CTR, -GCM, and so on because those are modes of operation, not the actual keys. AFAICT this also extends to XChaCha, which only accepts 256-bit keys, but the contents of that 256-bit key are the same. We can add the operation modes to the format, but to me that feels like it belongs on the encrypted data, not on the key.

Why Separate Key Types?

If we take the reasoning above to the extreme, these are all random bits. In theory we could just have "256-bit symmetric key". Flagging the intention of the key feels right here ("oh, this is a ChaCha key!") more than the operation mode did. But again, please let me know if you disagree!

Why 0xa?

AES starts with "A", and is an AEAD cipher 😛 I'm not attached to the number range; let me know where to move it and it shall be done!

expede commented 3 years ago

@b5 Any thoughts?

b5 commented 3 years ago

What if we dropped key length prefixes entirely & just relied on the length of the key bytes within the CID? E.g., if you're looking at a 0xa0 key that's 24 bytes long, you know you've got a 192-bit AES key. Ditto for ChaCha.

If that works, we can drop to just two multicodec entries, one for each symmetric key type.
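A minimal sketch of the length-based approach, assuming a key is stored as a varint codec prefix followed by the raw key bytes. The code 0xa0 for AES comes from the example above; 0xa1 for ChaCha, the allowed key sizes, and the helper names are hypothetical and chosen only for illustration.

```python
from typing import Tuple

# Hypothetical codes: 0xa0 follows the example in this thread,
# 0xa1 is an invented placeholder for a single ChaCha entry.
AES_KEY = 0xA0
CHACHA_KEY = 0xA1

# Assumed valid key sizes (in bits) per key type.
VALID_BITS = {AES_KEY: {128, 192, 256}, CHACHA_KEY: {128, 256}}
NAMES = {AES_KEY: "aes", CHACHA_KEY: "chacha"}


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Decode an unsigned varint; return (value, number of bytes consumed)."""
    value = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << (7 * i)
        if byte < 0x80:
            return value, i + 1
    raise ValueError("truncated varint")


def describe_key(prefixed: bytes) -> str:
    """Infer the key variant from the codec prefix plus the key length."""
    code, used = decode_varint(prefixed)
    if code not in VALID_BITS:
        raise ValueError(f"unknown key codec 0x{code:x}")
    bits = (len(prefixed) - used) * 8
    if bits not in VALID_BITS[code]:
        raise ValueError(f"{bits}-bit key is not valid for {NAMES[code]}")
    return f"{NAMES[code]}-{bits}"


# A 24-byte key under the AES code is read as a 192-bit AES key.
print(describe_key(encode_varint(AES_KEY) + bytes(24)))
```

The trade-off is that the key size is validated at parse time rather than being carried by the codec itself, so a key with an unexpected length is only rejected once the bytes are inspected.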

expede commented 3 years ago

Yeah, totally, just parse the length. That could also be said of SHA, which has multiple length-based versions on this list. I wonder if there's a reason for that? Any idea who the right person to ask about the original reasoning would be?

vmx commented 3 years ago

The length in the multihash is meant to identify the length of the bytes to come, not the length of the original digest. The idea is that you can also store truncated hashes. So if the resulting hash is different per variant (as in SHA2), it should really be different multicodecs. If the truncated hash is the same as the longer one (as with BLAKE3, for example), there would only be one variant.

I thought this would be documented somewhere, but I can't find anything. I've opened https://github.com/multiformats/multihash/issues/133 in the hope that someone will find the time to document it properly.
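The difference between the length prefix and the digest size can be sketched as follows. This is an illustration only, assuming the usual multihash layout of code, length, then digest bytes, and handling only single-byte varints. The codes 0x12 and 0x13 are the sha2-256 and sha2-512 entries in the multicodec table; the `multihash` helper is hypothetical.

```python
import hashlib
from typing import Optional

SHA2_256 = 0x12  # sha2-256 in the multicodec table
SHA2_512 = 0x13  # sha2-512 in the multicodec table


def multihash(code: int, digest: bytes, length: Optional[int] = None) -> bytes:
    """Build <code><length><digest bytes>, optionally truncating the digest.

    Simplification: code and length must each fit in one varint byte.
    """
    body = digest if length is None else digest[:length]
    if code >= 0x80 or len(body) >= 0x80:
        raise ValueError("this sketch only handles single-byte varints")
    return bytes([code, len(body)]) + body


data = b"hello"
d256 = hashlib.sha256(data).digest()
d512 = hashlib.sha512(data).digest()

full = multihash(SHA2_256, d256)             # length byte 32: full digest
short = multihash(SHA2_256, d256, 20)        # length byte 20: truncated, still sha2-256
same_length = multihash(SHA2_512, d512, 32)  # length byte 32, but a different function

# The length byte only says how many bytes follow.
print(full[1], short[1], same_length[1])
# A truncated SHA-512 digest is not the SHA-256 digest, so the two need separate codes.
print(same_length[2:] != full[2:])
```

Because `full` and `same_length` carry the same length byte but different digests, the length alone cannot tell a reader which hash function produced the bytes.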

b5 commented 3 years ago

Delighted to learn the reason for length-variant multicodecs. Looks great!

expede commented 3 years ago

@rvagg

Let us know if you're happy with these as they are and we can merge.

I'm happy; let's merge it! 🚢

expede commented 3 years ago

Thanks 😄