ipld / specs

Content-addressed, authenticated, immutable data structures
Other
592 stars 108 forks source link

hashmap: switch default hashAlg to SHA2-256 #316

Closed rvagg closed 3 years ago

rvagg commented 3 years ago

I kind of want to see some performance numbers on this and am tempted to add back in Murmur as an optional but explain the reasons why you might choose one over the other. Jeromy said (regarding this switch that Filecoin made) that SHA2-256 was a bit faster than Murmur on the CPUs that Filecoin has been focusing on (which have SHA2 instructions). But for JavaScript, the delta could be quite large in the other direction but I just don't know. Certainly if you want a cryptographically secure HAMT then go SHA2-256 ... and maybe a secure default is something we should be shipping anyway because these kinds of choices are nuanced—hash collision attacks are a consistent feature of programs and runtimes and we don't seem to be getting better at identifying where they are likely to show up, maybe we do our users a favour by opting for secure by default.

mvdan commented 3 years ago

No strong opinion here, but I agree that we should be secure by default and drive this decision with benchmarks.

warpfork commented 3 years ago

maybe we do our users a favour by opting for secure by default.

:100:

IMO DoS vectors are perfectly real threats, and especially so in data structures that are containing relatively "untrusted" user input... which is the case a decentralized data structure always finds itself in. Using cryptographic hashes for the sharding is a very meaningful improvement to the stability of this structure, and we should do it -- if not always, then at least absolutely by default.

warpfork commented 3 years ago

This seems like almost an auto-merge to me, reviewing this again today?

We kinda... really don't like murmur, it turns out, and hadn't even specified what that is particularly well when we looked at it again recently. So I'm extremely enthusiastic about removing it from this spec file in favor of literally anything.

I think we're also ending up with implementations parameterizing this pretty consistently, so the text about what's recommended here isn't especially load bearing and there just isn't a reason to be giving a massive amount of caution around change.