Closed pedrocr closed 7 years ago
But if it's possible to generate a collision by just manipulating structure, `Hash` and `CryptoHash` are equally broken.
Incorrect. Again, hashDoS is an algorithmic attack that relies on the attacker being able to find large numbers of collisions. Finding a single collision is not catastrophic, because it doesn't help the attacker that much.
Furthermore, the interesting cases for hashDoS are `HashMap<T, _>` for a single type `T`. For typical `Hash` cases we don't need to worry about collisions across types.
With a `CryptoHash`, which calculates a content hash, any single collision is catastrophic, and we aren't constrained by `T` and have to worry about collisions across types.
These are totally different threat models you are trying to conflate. `Hash` is not designed to solve the problem you're proposing.
My proposal is very simple. Do exactly what `Hash` does but feed it to a crypto hash. I've yet to see an attack that breaks that but it may exist.
The onus is on you to show why your scheme is secure. Otherwise you're shifting the burden of proof. Your (ill-defined) scheme is not secure simply because I haven't found an attack. You need to be able to answer questions like: how is the scheme domain separated everywhere type-by-type? How does it distinguish a `Vec<u8>` containing the data that would be hashed for some struct from the struct itself?
A secure scheme will be demonstrably unambiguous and free of collisions for arbitrarily structured messages. Anything less is insecure.
This is a much higher bar than `Hash`.
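To make the two questions above concrete, here is a minimal sketch of what type-by-type domain separation could look like. Everything in it is hypothetical: the `CanonicalEncode` trait, the tag values, and the `encode_struct` helper are made up for illustration and are not an existing crate's API. It only builds the byte string that would then be fed to a cryptographic hash, and it is not a proven scheme.

```rust
/// Hypothetical trait: an unambiguous byte encoding of a value.
trait CanonicalEncode {
    fn encode(&self, out: &mut Vec<u8>);
}

// Illustrative one-byte type tags (assumed values, not a standard).
const TAG_U32: u8 = 0x01;
const TAG_BYTES: u8 = 0x02;
const TAG_STRUCT: u8 = 0x03;

/// Lengths are written as fixed-width little-endian u64.
fn put_len(out: &mut Vec<u8>, len: usize) {
    out.extend_from_slice(&(len as u64).to_le_bytes());
}

impl CanonicalEncode for u32 {
    fn encode(&self, out: &mut Vec<u8>) {
        out.push(TAG_U32);
        out.extend_from_slice(&self.to_le_bytes());
    }
}

impl CanonicalEncode for Vec<u8> {
    fn encode(&self, out: &mut Vec<u8>) {
        out.push(TAG_BYTES);
        put_len(out, self.len());
        out.extend_from_slice(self);
    }
}

/// Structs carry a tag, their length-prefixed name, and a field count.
fn encode_struct(name: &str, fields: &[&dyn CanonicalEncode], out: &mut Vec<u8>) {
    out.push(TAG_STRUCT);
    put_len(out, name.len());
    out.extend_from_slice(name.as_bytes());
    put_len(out, fields.len());
    for field in fields {
        field.encode(out);
    }
}

struct Person { id: u32 }
struct Dog { id: u32 }

impl CanonicalEncode for Person {
    fn encode(&self, out: &mut Vec<u8>) {
        encode_struct("Person", &[&self.id], out);
    }
}

impl CanonicalEncode for Dog {
    fn encode(&self, out: &mut Vec<u8>) {
        encode_struct("Dog", &[&self.id], out);
    }
}

/// Convenience wrapper returning the encoded bytes.
fn encoded<T: CanonicalEncode>(value: &T) -> Vec<u8> {
    let mut out = Vec::new();
    value.encode(&mut out);
    out
}
```

With this sketch, `encoded(&Person { id: 0 })` and `encoded(&Dog { id: 0 })` differ because of the embedded type name, and a `Vec<u8>` that happens to contain a struct's encoding starts with the byte-string tag and a length, so it cannot be mistaken for the struct itself.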
With a `CryptoHash`, which calculates a content hash, any single collision is catastrophic, and we aren't constrained by `T` and have to worry about collisions across types.
If you care about collisions across types then yes, it's easy to generate collisions that are benign in `Hash` but break `CryptoHash`. Again, not my use case, but I can see how you'd want that in general and if there's a simple way to do that then great. `objecthash` doesn't currently do that completely though, as it hashes the same structure of fields to the same value independently of type. So `objecthash(Person{id:0}) == objecthash(Dog{id:0})`. Generating extra collisions with `Hash` isn't trivial actually, but can be done with things like `hash((0u16,0u16)) == hash((0u8,0u8,0u8,0u8))`. That's just a trivial bug (i.e., since `Hash` doesn't care about different types, it doesn't hash the length of a tuple the way it does for `Vec`). But if there's a standard way to fix that, all the better. I've had a second look at `objecthash` and opened a bug report on it.
After looking at `objecthash`, what it does is create a hash that's interoperable between languages and thus has some situations that generate collisions on purpose:
`First{id: 0}` and `Second{id: 0}` will hash to the same value even if one uses `u32` and the other `u64`. This makes it easier to have values for which `objecthash(a) == objecthash(b)` and yet `myfunc(a) != myfunc(b)`. For some applications this is not ideal. Given this, I'd say it would make sense to have a simpler scheme that doesn't have the issues of `Hash` (e.g., tuples would also hash their length) but just does the hashing of all fields in order with all the contents.
hash((0u16,0u16)) == hash((0u8,0u8,0u8,0u8))
I would question whether it's a bug, because the byte representation of the data is the same. Hash functions work on bytes, not data structures. Let's say you have a vector in Rust and in Java, both holding exactly the same data. A hash of those should be the same in both languages, even though the underlying implementation of vectors could be different. If we couple a hash function to the implementation of the data structure in Rust, the hash would be usable only in Rust. What's even worse, changes to the compiler could affect the resulting hashes.
@Trojan295 I'm not sure if you're arguing that it's a bug or not. `(0u16, 0u16)` and `(0u8, 0u8, 0u8, 0u8)` are not holding the same data: one has 2 zeroes, the other 4. The fact that both end up generating 4 zero bytes in memory is an implementation detail. And Rust already does `hash([0u16, 0u16]) != hash([0u8, 0u8, 0u8, 0u8])` because the `Vec` length is already hashed. Only the tuple length isn't.
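The tuple vs. slice difference can be checked without relying on any particular hash function. The sketch below uses a made-up `Recorder` type that implements `std::hash::Hasher` and simply records the bytes it is fed, so the input streams can be compared directly. Note that the exact bytes `Hash` implementations emit are an implementation detail of the standard library and not a stability guarantee, so treat the observations as "what current std does".

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical helper: a `Hasher` that records every byte it receives
/// instead of mixing it into a state.
#[derive(Default)]
struct Recorder {
    bytes: Vec<u8>,
}

impl Hasher for Recorder {
    fn write(&mut self, bytes: &[u8]) {
        self.bytes.extend_from_slice(bytes);
    }

    fn finish(&self) -> u64 {
        0 // unused; we only care about the recorded input
    }
}

/// Returns the byte stream that `value.hash(..)` feeds to a hasher.
fn fed_bytes<T: Hash + ?Sized>(value: &T) -> Vec<u8> {
    let mut recorder = Recorder::default();
    value.hash(&mut recorder);
    recorder.bytes
}
```

With current std, `fed_bytes(&(0u16, 0u16))` and `fed_bytes(&(0u8, 0u8, 0u8, 0u8))` are both four zero bytes, while `fed_bytes(&[0u16, 0u16])` and `fed_bytes(&[0u8, 0u8, 0u8, 0u8])` differ because a length prefix is written first. Any hasher, cryptographic or not, that only sees this stream cannot tell the two tuples apart.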
Are you trying to use those structured hashing functions only for internal Rust purposes like `HashMap`, or do you want to make them usable in a cryptographic manner? For such internal use that's OK, but hashes used in cryptography are mostly sent to other entities. Now if you define a custom data structure in Rust and apply some hash to it, the other entity needs to know how you calculated the hash of this structure.
Let's take this `Vec`. I don't know how it's done in Rust, but let's assume that the length is appended to the data and hashed. Then the party on the other side of the wire needs to know how you built the byte stream that was hashed (that you appended the length and did not, for example, prepend it). It's not simple to unify this across multiple parties.
Generally, I don't think such a feature is required in the case of cryptographic hashes, as in crypto you mostly operate on numbers/bytes and not custom data structures. The way of hashing needs to be well known and unified.
@Trojan295 short of `Hash` changing to a const-dependent pi-typed interface, this wouldn't work for things like `HashMap`, which needs a different type signature.
But as I've previously stated, if there's one thing I have a bug up my butt about, it's conflating security domains and concerns, and that's really what I want to avoid.
I am a huge fan of something like a `CryptoHash` scheme and have already implemented one in Rust and plan on designing and implementing another. I just want to make sure it covers all of the concerns I have. Those concerns are orthogonal to what `Hash` presently provides.
I will close this issue and will create a separate issue in the `RustCrypto/utils` repository. A crate for cryptographically hashing structs would be a great addition to the project, but as I stated earlier it's better to do this as a separate crate. I think the concerns raised by @tarcieri are good ones and should be addressed in this crate.
I think a utility crate could provide a wrapper type `struct HashWriter<'a, I: 'a + ?Sized + Input>(&'a mut I)` that `impl`s `io::Write` along with `input_le`, `input_be`, etc. We cannot make it totally generic over serde serializers anyway because most lack `bincode::serialize_into`, but folks can work around that on a case-by-case basis.
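A rough sketch of what such a wrapper might look like follows. The local `Input` trait is a stand-in defined here so the example compiles on its own; the real digest trait's method name and signature may differ. The `u32`-only `input_le` / `input_be` helpers and the `Collect` sink are likewise assumptions made for illustration.

```rust
use std::io::{self, Write};

/// Stand-in for the digest input trait (assumption: one method taking bytes).
trait Input {
    fn input(&mut self, data: &[u8]);
}

/// Wrapper that lets anything implementing `Input` be used as an `io::Write`.
struct HashWriter<'a, I: 'a + ?Sized + Input>(&'a mut I);

impl<'a, I: ?Sized + Input> Write for HashWriter<'a, I> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        // Feeding a hash never fails and always consumes the whole buffer.
        self.0.input(buf);
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}

impl<'a, I: ?Sized + Input> HashWriter<'a, I> {
    /// Feed an integer in little-endian byte order (u32 only in this sketch).
    fn input_le(&mut self, value: u32) {
        self.0.input(&value.to_le_bytes());
    }

    /// Feed an integer in big-endian byte order (u32 only in this sketch).
    fn input_be(&mut self, value: u32) {
        self.0.input(&value.to_be_bytes());
    }
}

/// Toy sink that collects bytes, so the example runs without a digest crate.
#[derive(Default)]
struct Collect(Vec<u8>);

impl Input for Collect {
    fn input(&mut self, data: &[u8]) {
        self.0.extend_from_slice(data);
    }
}
```

Because the wrapper is an `io::Write`, anything that can serialize into a writer can feed the hash directly, and the explicit `input_le` / `input_be` helpers pin the byte order instead of leaving it to the platform.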
@burdges Could you please specify for which use cases an `io::Write` impl is needed and the current `digest_reader` is not enough?
`std::hash::Hash` can be derived for structs and, together with `std::hash::Hasher`, is the standard hashing interface in Rust. The standard interface only allows 64-bit outputs, but there's nothing stopping extra outputs tailored to specific hashes. So for ergonomic purposes, wouldn't it make sense to have an adapter to allow using the `Hasher` API?
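For what it's worth, such an adapter could be sketched roughly like this. The `Digest` trait below is a simplified stand-in defined locally, not the real crate's API, and `Toy128` is a deliberately NON-cryptographic placeholder (two FNV-1a style lanes) used only so the example is self-contained. Keep in mind the concerns raised earlier in the thread: derived `Hash` impls feed integers in native byte order and do not separate types, so this shows the ergonomics only and says nothing about security.

```rust
use std::hash::{Hash, Hasher};

/// Hypothetical minimal digest interface (stand-in for a real digest trait).
trait Digest: Default {
    fn update(&mut self, data: &[u8]);
    fn finalize(&self) -> Vec<u8>;
}

/// Adapter exposing a digest through the standard `Hasher` API.
struct DigestHasher<D: Digest>(D);

impl<D: Digest> DigestHasher<D> {
    fn new() -> Self {
        DigestHasher(D::default())
    }

    /// Extra, digest-specific output beyond the 64 bits of `finish`.
    fn full_output(&self) -> Vec<u8> {
        self.0.finalize()
    }
}

impl<D: Digest> Hasher for DigestHasher<D> {
    fn write(&mut self, bytes: &[u8]) {
        self.0.update(bytes);
    }

    /// Truncates the digest to its first 8 bytes (little-endian).
    fn finish(&self) -> u64 {
        let out = self.0.finalize();
        let mut first = [0u8; 8];
        let n = out.len().min(8);
        first[..n].copy_from_slice(&out[..n]);
        u64::from_le_bytes(first)
    }
}

/// Toy placeholder digest, NOT cryptographic: two FNV-1a style lanes.
struct Toy128 {
    a: u64,
    b: u64,
}

impl Default for Toy128 {
    fn default() -> Self {
        Toy128 { a: 0xcbf29ce484222325, b: 0x84222325cbf29ce4 }
    }
}

impl Digest for Toy128 {
    fn update(&mut self, data: &[u8]) {
        for &byte in data {
            self.a = (self.a ^ byte as u64).wrapping_mul(0x100000001b3);
            self.b = (self.b ^ byte as u64).wrapping_mul(0x100000001b3);
        }
    }

    fn finalize(&self) -> Vec<u8> {
        let mut out = self.a.to_le_bytes().to_vec();
        out.extend_from_slice(&self.b.to_le_bytes());
        out
    }
}

#[derive(Hash)]
struct Person {
    id: u32,
    name: String,
}

/// Hashes any `Hash` value and returns the full digest output.
fn full_hash<T: Hash>(value: &T) -> Vec<u8> {
    let mut hasher = DigestHasher::<Toy128>::new();
    value.hash(&mut hasher);
    hasher.full_output()
}
```

Calling `full_hash(&Person { id: 7, name: "x".to_string() })` then yields the full 16-byte toy output, while `finish()` on the same adapter yields only its first 64 bits.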