multiformats / multicodec

Compact self-describing codecs. Save space by using predefined multicodec tables.
MIT License

Keyed hashes (SipHash) #265

Open TotalKrill opened 2 years ago

TotalKrill commented 2 years ago

SipHash, a keyed hash function that is widely used (at least internally in software), is missing from the codec table. It needs a key both to compute and to verify a digest.

My use case needs a hash function that produces small digests (ideally 32 bits at most) for IoT devices.

The recommended parameters are SipHash-2-4 for best performance and SipHash-4-8 for conservative security. A few languages use SipHash-1-3 (from Wikipedia).
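For reference, the "c-d" in those names is the number of compression rounds per 8-byte message word (c) and the number of finalization rounds (d). Below is a minimal, unoptimized sketch of the 64-bit SipHash-c-d construction as I understand it from the published description. The function name `siphash` and its signature are mine, not from any library, and the 32-bit truncation at the end is only an assumption about how a small digest could be derived; it is not a standardized variant.

```python
import struct

MASK64 = 0xFFFFFFFFFFFFFFFF


def _rotl(x: int, b: int) -> int:
    """Rotate a 64-bit value left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def _sipround(v0: int, v1: int, v2: int, v3: int):
    """One SipRound: add-rotate-xor mixing of the four state words."""
    v0 = (v0 + v1) & MASK64
    v1 = _rotl(v1, 13) ^ v0
    v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK64
    v3 = _rotl(v3, 16) ^ v2
    v0 = (v0 + v3) & MASK64
    v3 = _rotl(v3, 21) ^ v0
    v2 = (v2 + v1) & MASK64
    v1 = _rotl(v1, 17) ^ v2
    v2 = _rotl(v2, 32)
    return v0, v1, v2, v3


def siphash(key: bytes, data: bytes, c: int = 2, d: int = 4) -> int:
    """Return the 64-bit SipHash-c-d of data under a 16-byte key."""
    if len(key) != 16:
        raise ValueError("SipHash needs a 16-byte key")
    k0, k1 = struct.unpack("<QQ", key)
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    # Zero-pad to a multiple of 8 bytes; the last byte holds len(data) mod 256.
    padded = data + b"\x00" * (7 - len(data) % 8) + bytes([len(data) & 0xFF])

    # Compression: c rounds per little-endian 64-bit word.
    for (m,) in struct.iter_unpack("<Q", padded):
        v3 ^= m
        for _ in range(c):
            v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m

    # Finalization: d rounds.
    v2 ^= 0xFF
    for _ in range(d):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3


key = bytes(range(16))                     # example key, not a secret
digest64 = siphash(key, b"sensor-42")      # SipHash-2-4 by default
digest32 = digest64 & 0xFFFFFFFF           # assumption: truncate for a 32-bit digest
fast = siphash(key, b"sensor-42", c=1, d=3)  # SipHash-1-3
print(f"{digest64:016x} {digest32:08x} {fast:016x}")
```

Each parameter choice is effectively a different hash function, so SipHash-2-4, SipHash-4-8 and SipHash-1-3 would presumably each need their own table entry.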

Is there anything that prohibits adding this to multicodec? I suppose not for the table itself, but for my intended use case I suspect it may be problematic in, for example, multihash.

rvagg commented 2 years ago

Gee, this one's a bit of a stretch. In theory we should just be able to add all of the useful (or valid?) members of the SipHash family, but keyed hash functions face a utility problem for content addressing, which is the primary use of multihash: you also need the key to do anything useful with the digest.
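To make the utility problem concrete, here is a rough sketch. Python's standard library does not expose SipHash, so this uses keyed BLAKE2b from `hashlib` purely as a stand-in for "some keyed hash"; the helper names `keyed_digest` and `verify`, the keys and the content are all made up for illustration.

```python
import hashlib
import hmac


def keyed_digest(key: bytes, content: bytes, size: int = 4) -> bytes:
    """Keyed digest of content (BLAKE2b as a stand-in for SipHash)."""
    return hashlib.blake2b(content, key=key, digest_size=size).digest()


def verify(key: bytes, content: bytes, expected: bytes) -> bool:
    """Recompute the digest with the given key and compare in constant time."""
    return hmac.compare_digest(keyed_digest(key, content, len(expected)), expected)


content = b"temperature=21.5"
publisher_key = b"publisher-secret"   # hypothetical 16-byte key
other_key = b"someone-elses-ky"       # a peer that was never given the key

address = keyed_digest(publisher_key, content)  # 32-bit "content address"

print(verify(publisher_key, content, address))  # key holder can check the content
print(verify(other_key, content, address))      # anyone else cannot
```

With an unkeyed hash the second check would succeed for anyone holding the bytes. With a keyed hash the address is only meaningful inside the group that shares the key.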

But neither the multicodec table nor the multihash "spec" (afaik) dictates how entries are used or imposes constraints that would prohibit using them with a key, so I suppose there shouldn't be anything preventing the addition of keyed hash functions. Maybe a bit of finesse applied to the comment column would be appropriate, but ultimately it tends toward caveat emptor around here (if someone wants to make CIDs with SipHash and discovers they're less useful for sharing data, they'll discover that fairly quickly, I think!).
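As a sketch of why the format itself doesn't get in the way: a multihash is just a varint function code, a varint digest length, and the digest bytes, with no slot for a key. The code `0x300000` below is a placeholder taken from the private-use range, since no SipHash code has been assigned, and the helper names are hypothetical.

```python
def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned LEB128 varint."""
    if n < 0:
        raise ValueError("varints are unsigned")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


# Placeholder only: private-use range, NOT an assigned multicodec entry.
HYPOTHETICAL_SIPHASH_2_4 = 0x300000


def wrap_multihash(code: int, digest: bytes) -> bytes:
    """Frame a digest as <varint code><varint length><digest>."""
    return encode_varint(code) + encode_varint(len(digest)) + digest


digest = bytes.fromhex("0123456789abcdef")  # dummy 64-bit digest for illustration
mh = wrap_multihash(HYPOTHETICAL_SIPHASH_2_4, digest)
print(mh.hex())
```

The framing works for any digest, keyed or not. The key would have to travel out of band, which is exactly where the caveat emptor comes in.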