tkaitchuck / aHash

aHash is a non-cryptographic hashing algorithm that uses the AES hardware instruction
https://crates.io/crates/ahash
Apache License 2.0

feature request: ahash without length prefixing #198

Open nwalfield opened 6 months ago

nwalfield commented 6 months ago

In Sequoia, we currently use xxhash to compare streams. ahash appears to be a better choice than xxhash because, as discussed in your README, it is faster, and it is already used by our other dependencies. The problem is that ahash appears to automatically add length prefixes.

First, perhaps I'm holding it wrong. In that case, I apologize in advance for the noise, but would appreciate any tips.

As an aside, I'm a bit confused by this note in the Rust documentation for Hasher::write:

Note to Implementers

You generally should not do length-prefixing as part of implementing this method. It’s up to the Hash implementation to call Hasher::write_length_prefix before sequences that need it.

I understand that to mean that an implementation of Hash should do length prefixing, whereas an implementation of Hasher, like ahash's, should not. ahash's implementation seems to do it anyway. Is this correct?
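To make the division of labour in that note concrete, here is a minimal sketch using only the standard library. `Recorder`, `via_hash`, and `via_write` are hypothetical names introduced for illustration; `Recorder` is not a real hash function and says nothing about ahash's internals. It simply stores the bytes it receives, so the framing that the standard `Hash` implementations add before reaching `Hasher::write` becomes visible.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical hasher for illustration only: it records every byte it is
/// given instead of mixing it, so framing added by `Hash` impls is visible.
#[derive(Default)]
struct Recorder {
    bytes: Vec<u8>,
}

impl Hasher for Recorder {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        0 // not meaningful here; only the recorded bytes matter
    }
}

/// Bytes the hasher receives when the value's `Hash` impl drives it.
fn via_hash<T: Hash + ?Sized>(value: &T) -> Vec<u8> {
    let mut r = Recorder::default();
    value.hash(&mut r);
    r.bytes
}

/// Bytes the hasher receives from a direct `Hasher::write` call.
fn via_write(data: &[u8]) -> Vec<u8> {
    let mut r = Recorder::default();
    r.write(data);
    r.bytes
}

fn main() {
    let data: &[u8] = &[1, 2];
    println!("slice.hash(): {:?}", via_hash(data));
    println!("write():      {:?}", via_write(data));
    println!("str.hash():   {:?}", via_hash("ab"));
}
```

With the current standard library, the slice goes through `Hash` with its length (as native-endian `usize` bytes) in front of the data, and the `str` gets a trailing `0xff` byte, while the direct `write` call passes the bytes through unchanged. Any additional length mixing inside a particular hasher's `write` would come on top of this.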

Assuming ahash's implementation is okay, I'd like to suggest a variant that can work on streams by not doing length prefixing and not padding short writes.

tkaitchuck commented 5 months ago

When this PR merges into the standard library, we will be able to remove the length prefixing: https://github.com/rust-lang/rust/issues/96762

Until then, any hasher that does not work the way SipHash does, where the algorithm depends only on the byte sequence rather than on the individual calls, would be vulnerable to a DoS attack. That is the case whenever h.hash_slice(&[a, b]); h.hash_slice(&[c]); is not guaranteed to give the same result as h.hash_slice(&[a]); h.hash_slice(&[b, c]);. This includes XXHash.
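The SipHash-style property described here can be checked with the standard library's `DefaultHasher`, which is used below purely as a stand-in; `hash_chunks` is a hypothetical helper. This is a sketch of the property itself, not a statement about how ahash or XXHash behave.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;

/// Hash a sequence of chunks by feeding each one to `Hasher::write`.
/// `DefaultHasher::new()` uses fixed keys, so results are reproducible.
fn hash_chunks(chunks: &[&[u8]]) -> u64 {
    let mut h = DefaultHasher::new();
    for chunk in chunks {
        h.write(chunk);
    }
    h.finish()
}

fn main() {
    // The same bytes "abc", split at different call boundaries.
    let split_early = hash_chunks(&[&b"a"[..], &b"bc"[..]]);
    let split_late = hash_chunks(&[&b"ab"[..], &b"c"[..]]);
    let unsplit = hash_chunks(&[&b"abc"[..]]);

    // For a hasher that depends only on the byte sequence, all three agree.
    assert_eq!(split_early, split_late);
    assert_eq!(split_late, unsplit);
    println!("call boundaries did not affect the result: {unsplit:#x}");
}
```

A hasher for which these assertions can fail is one whose result depends on the calls rather than on the bytes, which is the situation the comment above describes.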

tkaitchuck commented 5 months ago

In the meantime, if you want to avoid the extra call, you can call write on the hasher rather than hash on the object.
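For the stream-comparison use case from the original request, that tip might look roughly like the sketch below. `hash_stream` is a hypothetical helper, and `DefaultHasher` is again a standard-library stand-in so the example runs without extra dependencies; any other type implementing `Hasher` could be swapped in. Whether the result is independent of the buffer size depends on whether the chosen hasher's `write` is sensitive to call boundaries, which is the property discussed above.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::Hasher;
use std::io::{self, ErrorKind, Read};

/// Hash everything `reader` yields by passing raw chunks straight to
/// `Hasher::write`, bypassing any `Hash` impl and the framing it would add.
/// At most `buf_size` bytes are read per call; `buf_size` must be non-zero.
fn hash_stream<R: Read>(mut reader: R, buf_size: usize) -> io::Result<u64> {
    assert!(buf_size > 0, "a zero-sized buffer would read nothing");
    let mut hasher = DefaultHasher::new(); // stand-in hasher (assumption)
    let mut buf = vec![0u8; buf_size];
    loop {
        match reader.read(&mut buf) {
            Ok(0) => break, // end of stream
            Ok(n) => hasher.write(&buf[..n]),
            Err(e) if e.kind() == ErrorKind::Interrupted => continue,
            Err(e) => return Err(e),
        }
    }
    Ok(hasher.finish())
}

fn main() -> io::Result<()> {
    let data = b"two streams with identical content";
    // `&[u8]` implements `Read`, so byte slices can play the role of streams.
    let small_reads = hash_stream(&data[..], 4)?;
    let large_reads = hash_stream(&data[..], 1024)?;
    assert_eq!(small_reads, large_reads);
    println!("streams match: {small_reads:#x}");
    Ok(())
}
```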

nwalfield commented 5 months ago

Thanks for the explanation, and the tip!