multiformats / multihash

Self describing hashes - for future proofing
https://multiformats.io/multihash/
MIT License

Capture payload size in the multihash #163

Open Gozala opened 3 months ago

Gozala commented 3 months ago

In almost all instances where we use a raw multihash, we find ourselves capturing the payload size on the side. It is also probably worth calling out the fact that the fr32-sha2-256-trunc254-padded-binary-tree multihash was defined somewhat recently to capture payload size in order to address potential vulnerabilities.

Given how common it is to want to capture payload size, I would like to propose a "multihash multihash" format: a multihash variant that uses multihash code 0x31 and encodes information about both the payload size and the digest. Here is the exact format I'd like to propose:

Format

<0x31><varint payload size in bytes><varint hash function code><varint digest size in bytes><hash function output>
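To make the layout concrete, here is a minimal sketch of how the proposed format could be encoded and decoded. It assumes the varints are unsigned LEB128 (as in existing multihash) and uses sha2-256 (code 0x12) as the inner hash function. The function names `encode_sized_multihash` and `decode_sized_multihash` are hypothetical and not part of any existing library.

```python
import hashlib
from typing import Tuple

SIZED_MULTIHASH_CODE = 0x31  # code proposed in this issue
SHA2_256_CODE = 0x12         # sha2-256, chosen here purely for illustration


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if value < 0:
        raise ValueError("varint must be non-negative")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Decode a varint starting at offset; return (value, next_offset)."""
    value = 0
    shift = 0
    while True:
        if offset >= len(data):
            raise ValueError("truncated varint")
        byte = data[offset]
        offset += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, offset
        shift += 7


def encode_sized_multihash(payload: bytes) -> bytes:
    """Hypothetical encoder: <0x31><payload size><hash code><digest size><digest>."""
    digest = hashlib.sha256(payload).digest()
    return b"".join([
        encode_varint(SIZED_MULTIHASH_CODE),
        encode_varint(len(payload)),
        encode_varint(SHA2_256_CODE),
        encode_varint(len(digest)),
        digest,
    ])


def decode_sized_multihash(data: bytes) -> Tuple[int, int, bytes]:
    """Hypothetical decoder: return (payload_size, hash_code, digest)."""
    code, offset = decode_varint(data)
    if code != SIZED_MULTIHASH_CODE:
        raise ValueError("not a sized multihash")
    payload_size, offset = decode_varint(data, offset)
    hash_code, offset = decode_varint(data, offset)
    digest_size, offset = decode_varint(data, offset)
    digest = data[offset:offset + digest_size]
    if len(digest) != digest_size:
        raise ValueError("truncated digest")
    return payload_size, hash_code, digest
```

Under these assumptions, a 5-byte payload hashed with sha2-256 yields a 4-byte header (`31 05 12 20`) followed by the 32-byte digest. Note that the tail after the payload size is itself an ordinary multihash.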

FAQ

Gozala commented 3 months ago

I should note that it was suggested to me to create a PR for this repo and perhaps call this multihash v2. However, as per the FAQ, I don't feel that using it everywhere we use multihash would be better, not to mention the pain of the upgrade it would introduce. That said, I think it is a good idea to have a format for a fairly common (at least in my experience) use case that can be recommended in place of a sidecar size field.

If there is both support and desire to make this into a real thing, I can write something more formal, but even then I could use some feedback on where this document should live and what format it should have.