Closed IS4Code closed 1 year ago
ooo, identity, I hadn't thought about that; the problem with changing that is that we do use it in CIDs and it's neither a cryptographic nor a noncryptographic hash.
I don't think we should be changing it. Maybe we should special-case this somewhere, perhaps in the multihash README / spec.
@vmx what's your take on this?
oh, @vmx requested it in https://github.com/multiformats/multihash/issues/157! surprising.
So a problem with doing this is that it will end up being special-cased somewhere in the stack. For go-multicodec which does codegen from the table, it'll not show up as a valid "multihash" for use with CIDs, so that's probably going to need a special-case for it.
I think I'm -1 on changing it since this doesn't improve the situation, "identity" isn't a hash, it's not a one-way function, it's a special-case.
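To make the special-casing concern above concrete, here is a minimal sketch, not go-multicodec's actual generator. It assumes a hypothetical consumer that selects CID-usable hash functions by filtering the table on the "multihash" tag. The rows are an illustrative excerpt in the table's column layout, and the function names are made up for this example.

```python
import csv
import io

# Illustrative excerpt in the column layout of the multicodec table
# (name, tag, code, status, description). Not authoritative data.
TABLE = """\
name,tag,code,status,description
identity,multihash,0x00,permanent,raw binary
sha2-256,multihash,0x12,permanent,
murmur3-x64-64,hash,0x22,permanent,
"""

def parse(table_csv):
    """Parse the CSV text into a list of dicts with whitespace stripped."""
    reader = csv.DictReader(io.StringIO(table_csv))
    return [{k.strip(): v.strip() for k, v in row.items()} for row in reader]

def naive_cid_hashes(rows):
    """Hypothetical consumer: anything tagged 'multihash' is usable in a CID."""
    return [r["name"] for r in rows if r["tag"] == "multihash"]

def special_cased_cid_hashes(rows):
    """Same filter, plus the explicit identity exception discussed above."""
    return [
        r["name"]
        for r in rows
        if r["tag"] == "multihash" or r["name"] == "identity"
    ]

rows = parse(TABLE)
# Simulate the proposed change: identity retagged from "multihash" to "hash".
retagged = [dict(r, tag="hash") if r["name"] == "identity" else r for r in rows]

print(naive_cid_hashes(rows))              # identity is included today
print(naive_cid_hashes(retagged))          # identity silently disappears
print(special_cased_cid_hashes(retagged))  # needs an explicit exception
```

Under these assumptions, retagging identity would push the exception out of the table and into every consumer that filters on the tag.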
That's fine with me. I just thought "hash" is better than "multihash" even if it's wrong, but that might not be true. I agree with @rvagg here, so let's leave "identity" just as it is in this weird special-case state. Sorry @IS4Code for the extra work, but please change it back.
Contrary opinion: we use multihash as a (bad) synonym for cryptographic hash or, more specifically in the CID use case, collision-resistant-hash. In other words we use the multihash column as a proxy for "is this at all adversary resistant?". In this context identity gives the strongest possible guarantee: there are no possible "collisions" to be found now or in the future against such "content addressing".
Leaving aside what multihash means: in the context of the table being edited, identity and murmur could not be further apart on the safety spectrum.
The question is not whether identity hash is cryptographic or not, it's whether it's a hash function or not. There are many definitions out there, but let's take the one from Wikipedia:
A hash function is any function that can be used to map data of arbitrary size to fixed-size values.
In the identity hash case it doesn't hash to a fixed-size value, it depends on the input. => not a hash function.
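The fixed-size point can be checked with a small sketch of the multihash layout (varint function code, varint digest length, digest). This is an illustration rather than a reference implementation; the helper names `varint` and `multihash` are made up for this example, and only the identity (0x00) and sha2-256 (0x12) codes are handled.

```python
import hashlib

IDENTITY = 0x00  # multicodec code for "identity"
SHA2_256 = 0x12  # multicodec code for "sha2-256"

def varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint (LEB128)."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)

def multihash(code: int, data: bytes) -> bytes:
    """Build <varint code><varint digest length><digest> for two known codes."""
    if code == IDENTITY:
        digest = data  # the "digest" is the input itself
    elif code == SHA2_256:
        digest = hashlib.sha256(data).digest()
    else:
        raise ValueError(f"unsupported code: {code:#x}")
    return varint(code) + varint(len(digest)) + digest

short, long_ = b"hi", b"x" * 1000
sha_lengths = [len(multihash(SHA2_256, d)) for d in (short, long_)]
identity_lengths = [len(multihash(IDENTITY, d)) for d in (short, long_)]

print(sha_lengths)       # same length for both inputs
print(identity_lengths)  # grows with the input
```

The sha2-256 multihash has the same length for both inputs, while the identity multihash grows with its input, which is the property the Wikipedia definition excludes.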
It's subtle. Focus on this part:
In other words we use the multihash column as a proxy for "is this at all adversary resistant?"
That depends on what you mean by "adversary resistant". If you mean "collision free", that's true. But for a hash function you usually also take preimage resistance into account, which means that you cannot get the original input data from the hash. For identity that's not the case: the "hash" is the input.
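A minimal sketch of the preimage point, assuming inputs shorter than 128 bytes so that the function code and the digest length each fit in a single varint byte. The helper names are hypothetical and used only for illustration.

```python
import hashlib

def identity_multihash(data: bytes) -> bytes:
    # Assumption: len(data) < 128, so code and length are one byte each.
    assert len(data) < 128
    return bytes([0x00, len(data)]) + data

def sha2_256_multihash(data: bytes) -> bytes:
    digest = hashlib.sha256(data).digest()
    return bytes([0x12, len(digest)]) + digest

def digest_of(mh: bytes) -> bytes:
    """Strip the one-byte code and one-byte length, return the digest bytes."""
    return mh[2:2 + mh[1]]

secret = b"my secret"
recovered = digest_of(identity_multihash(secret))
hashed = digest_of(sha2_256_multihash(secret))

print(recovered == secret)  # the identity "digest" is the input itself
print(hashed == secret)     # the sha2-256 digest does not reveal the input
```

Anyone holding an identity multihash already holds the original data, so there is nothing to invert.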
Preimage resistance in the context of CIDs is moot: their overwhelming purpose is to point to the original content, which is in turn discoverable in a location-less manner from multiple parties.
Multihashes are not only used for CIDs.
In the identity hash case it doesn't hash to a fixed-size value, it depends on the input. => not a hash function.
Not disagreeing in general, but a multihash itself is also technically not a fixed-size value, or at least it shouldn't be treated as such, although its size does not depend on the size of the data. I've also seen situations where hashing individual large chunks of a file and concatenating the results could be treated as a hash of the whole file, despite being proportional to the original size.
2023-01-24 IPLD maintainer conversation: can we push ahead without identity (to avoid scope creep) and deal with that as a separate issue?
just noticed that identity was removed from here, so this is merged now, thanks!
Murmur3 hashes changed to identity and hash, per multiformats/multihash#157.