prysmaticlabs / hashtree

SHA256 library highly optimized for Merkle tree computations
MIT License
24 stars 6 forks source link

Name functions specifically to their 64-byte constraint #1

Closed arnetheduck closed 1 year ago

arnetheduck commented 2 years ago

Currently, functions implemented in this library are named sha256_XXX which is somewhat prone to confusion and name conflicts: these are not "general" sha functions since they only support 64-byte inputs - because the C symbol namespace is shared, there's a high risk of conflicts with "normal" implementation that take arbitrary length data.

sha256_64_XXX is one option.

potuz commented 2 years ago

An alternative is to use runtime CPU detection and export only a single header function hash which is the approach taken in gohashtree. This makes it slightly less portable in that it would need cpuid.h, but seems that the UX would be better?

arnetheduck commented 2 years ago

I'm not sure I follow - the functions are implemented in the .S file and are exported using their "optimized" names (sha256_1_sse) - this name occupies an entry in the global symbol table (when linking statically for example) - runtime CPU detection picks one of the optimized versions, but in order to do so, it has to import the symbol from the assembly file (via extern). By the time we get here, the conflict (with another optimized sha256 library) will already have happened.

Ditto for the documentation - sha256 implies this is a "full" algorithm, which is confusing.

See also https://github.com/status-im/nim-ssz-serialization/pull/35/ which provides an example of the dynamic selection in C.

potuz commented 2 years ago

Sure, I'll change the names accordingly for the assembly global symbols, but I can hide them to something like _sha256_64_sse, and only export a single function in the only header that does the selection. I think we have similar algorithms to check with CPUID https://github.com/potuz/mammon/blob/main/ssz/hasher.cpp#L43

arnetheduck commented 2 years ago

. I think we have similar algorithms to check with CPUID

Right - I took the flags from gohashtree instead which seems like a safer bet, in terms of being up-to-date - in particular, it checks for multiple features for some of the algos which seemed safer, but then again, it could also be wrong / unnecessary.

arnetheduck commented 2 years ago

something like

a foolproof name would perhaps be hashtree_sha256_64_... - blst for example exports its functions with a blst_ prefix.