Closed: heejaechang closed this 5 days ago
That can be done by pushing individual case-normalized characters
Could you link me to where it supports pushing characters? Thanks!
I assume you don't care about endianness if you're wanting to hash System.String values directly?
Correct. :)
I've updated the API since the last version published to NuGet, but you can try it out with the latest CI build
You can use Blake2b.CreateIncrementalHasher(), which will return the hash state struct. That has an Update() that accepts a single value or a Span of values.
So you can call that with aString.AsSpan(), or you could case-normalize a string a chunk at a time into a fixed buffer, or just grab a character at a time to update the hash state. Updating the state simply pushes new bytes into a buffer until a block is full, at which point the actual hash state is updated, so it's very lightweight.
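To illustrate the chunk-at-a-time idea, here is a minimal sketch in Python using the standard library's hashlib.blake2b as a stand-in. It is not the Blake2Fast API. The function name hash_case_insensitive, the 64-character chunk size, the 32-byte digest size, and the UTF-16LE byte encoding are all assumptions made for the example.

```python
import hashlib


def hash_case_insensitive(text: str, chunk_chars: int = 64) -> bytes:
    """Hash text case-insensitively, feeding one normalized chunk at a time.

    Hypothetical helper: the name, chunk size, and digest size are
    illustrative assumptions, not part of any library discussed above.
    """
    state = hashlib.blake2b(digest_size=32)
    for start in range(0, len(text), chunk_chars):
        # Upper-case only the current chunk, so the fully normalized
        # string is never materialized in one piece.
        chunk = text[start:start + chunk_chars].upper()
        # Encode as UTF-16LE to mirror the in-memory layout of a .NET string
        # (an assumption about what bytes should be hashed).
        state.update(chunk.encode("utf-16-le"))
    return state.digest()


if __name__ == "__main__":
    print(hash_case_insensitive("Hello, World").hex())
```

Because each update only appends bytes to the running state, the result should match hashing the fully upper-cased string in a single call, regardless of the chunk size chosen.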
FYI on 64-bit platforms, SHA512 tends to outperform SHA256, especially for larger inputs (anything over a few dozen bytes), as is the case here. If you're going for raw speed and you need to use something built-in, it could be a stop-gap measure.
For that matter, MD5 is faster than both SHA2 variants on both platforms if you don't require cryptographic security.
on 64-bit platforms, SHA512 tends to outperform SHA256
BLAKE2 has similar characteristics. With scalar implementations, the 256-bit BLAKE2s variant runs faster on 32-bit while 512-bit BLAKE2b runs faster on 64-bit. With SIMD implementations, BLAKE2b is always faster.
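Relative speeds like these are easy to check on a given machine. The following is a rough timing sketch in Python with hashlib and timeit. The helper name time_hashes, the 64 KiB payload, and the repeat count are assumptions, and the numbers reflect Python's bindings on the local CPU rather than .NET implementations, so treat the ordering as indicative only.

```python
import hashlib
import timeit

# Algorithms mentioned in this thread that ship with Python's hashlib.
ALGORITHMS = ("md5", "sha256", "sha512", "blake2s", "blake2b")


def time_hashes(payload: bytes, repeats: int = 200) -> dict:
    """Return total seconds spent hashing `payload` `repeats` times per algorithm.

    Hypothetical helper for illustration; md5 may be unavailable on
    FIPS-restricted builds, in which case its constructor raises ValueError.
    """
    results = {}
    for name in ALGORITHMS:
        hasher = getattr(hashlib, name)
        results[name] = timeit.timeit(
            lambda: hasher(payload).digest(), number=repeats
        )
    return results


if __name__ == "__main__":
    payload = bytes(range(256)) * 256  # 64 KiB of sample data
    timings = time_hashes(payload)
    for name, seconds in sorted(timings.items(), key=lambda item: item[1]):
        print(f"{name:8s} {seconds:.4f} s")
```

Results depend heavily on hardware support (for example, CPUs with dedicated SHA instructions), so a measurement on the target platform is more reliable than a general rule.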
I'm fine with any system that meets the requirements stated in https://github.com/dotnet/roslyn/issues/33411#issuecomment-465272354. We don't need cryptographic security. We just want a reasonable hasher.
@CyrusNajmabadi In addition to those requirements, it can't be MD5 or SHA1.
Interesting, is that an external requirement/mandate @tmat? Nothing about the scenarios where we use these hashes seems like it would preclude those (at least from Roslyn's perspective).
Thanks @tmat. Have added that to our criteria.
It may be worth taking a look at BLAKE3 here; it's much faster than SHA-1, SHA-256, MD5, and BLAKE2.
Closing. We moved to xxhash128
@sharwell believes SHA1 is too slow for our checksum, so he wants us to use a different algorithm than SHA1 to make hashing faster.