Open sofia-snow opened 2 years ago
Interesting question. My instinct is that for most applications where constant-time ordering isn't an issue, the easiest thing is to convert to [u8; 32]
with .into()
. Given that option, I think if we were going to add an Ord
/PartialOrd
impl in this crate, it would make sense for it to be constant time. That way we maintain the general rule that "you can convert to [u8; 32]
when you don't care about constant-time stuff."
For what it's worth, I've often thought about splitting out the Hash
type into its own crate. Especially now that const generics are stabilized, it might make sense to revisit that. I wonder if the RustCrypto folks would be interested in sharing a single type here.
My understanding is not all of const-generics have landed, and RustCrypto need more than currently offered even by nightly.
https://github.com/RustCrypto/traits/issues/238#issuecomment-806931108
@oconnor663 Hi, is there any work done on that? I am trying to use Hash
on my BtreeMap
but I need to convert it to hex every time. :(
I haven't done any work on this myself, no. It looks like the right way to do it will be to replace our dependency on constant_time_eq
with a dependency on subtle
, which can do both Eq
and Ord
. If anyone's interested, this is probably a good beginner PR.
In the meantime, you might want to convert Hash
value to bytes rather than hex, either with .as_bytes()
or with .into()
. That'll be the same size as the original Hash
, which might be important if you're storing lots of them.
Shouldn't https://github.com/BLAKE3-team/BLAKE3/pull/203 be merged before eventually finding a better alternative?
Also why is a constant-time Ord
implementation required?
After thinking a bit about how it could be done when comparing 2 Hash I realised that having a constant-time implementation for this would require to always go through the whole byte array. This would be the equivalent of the worst case scenario of the default implementation which stops the comparison whenever the bytes aren't equal, so I don't see why that would make sense.
Same reason Hash
has a constant time Eq
implementation. Though I can't think of an attack off-hand, I would bet there's some use case out there that leaks information though the time it takes to compare (probably many) hashes - especially if you could control what hashes it's being compared to...
Anyway, I'm glad there's already an issue open here with something of a plan, I'll take a look at subtle
.
Basically whenever Blake3 is used as an HMAC you really really want CT comparison https://threatpost.com/timing-attack-google-keyczar-library-060209/72738/
I'd also say that I doubt this could be a bottleneck anywhere, it's 4 words xoring them instead of branching over them might even be faster
One issue to keep in mind here is that, even if Hash
provided a constant-time comparison, you could still leak information by using it in a data structure that doesn't make higher-level constant-time guarantees. For example, suppose we have a search tree of MACs, and our application takes a MAC from the user and asks "is this in the tree?" Maybe the tree looks like this:
mac2
/ \
mac1 mac4
/ \
mac3 mac5
Even if comparing individual MACs is constant time, the tree structure itself still has a timing leak. If I query it with a MAC that's less than mac2
, it's going to do 2 comparisons. But if I query it with a MAC that's greater than mac2
, it's going to do 3. If I can measure that timing difference, I can work from the high bits down to the low bits and recover mac2
in ~256 queries! (In general, an oracle that answers "is your MAC equal to the valid one?" is normal and no problem, but an oracle that answers "is your MAC greater than the valid one?" is broken and quickly leaks the valid MAC.)
We could probably come up with some way to fix the tree. We could add a requirement that all the leaves are at the same depth, and we could design some sort of "empty node" concept to make that work. I'm sure someone somewhere has done this before, but it's a very exotic problem to be solving, and no ordinary tree-map implementation is going to do anything like this.
So one worry I have about a constant-time Ord
implementation, is that it might inspire false confidence that putting MACs inside a BTreeSet
is safe. But that's not safe, and there's nothing Hash
can do to make it safe. I'm curious what other use cases folks here have in mind.
So one worry I have about a constant-time
Ord
implementation, is that it might inspire false confidence that putting MACs inside aBTreeSet
is safe. But that's not safe, and there's nothingHash
can do to make it safe.
I agree with that concern. Even in a non-MAC scenario it could be a problem - if you can fetch data based on it's hash, the time until cache miss could reveal blocks that are present in the cache.
I'm curious what other use cases folks here have in mind.
I don't think I have an actual use case for a constant-time Ord
implementation so much as any Ord
implementation. Mainly for things like sorting keys during serialization to remove hash map leaks.
This https://github.com/BLAKE3-team/BLAKE3/pull/267#issuecomment-1321273326 seems relevant over here.
Actually I should've said that it's not clear (to me) that we want any
Ord
implementation at all. In my mind, forcing the caller to convert hashes to bytes to sort them has some value, because it makes it clear that the comparison semantics are the regular[u8]
comparison semantics. Given that we do encourage users to rely on our constant-timeEq
impl, I worry it would be confusing to expose anOrd
impl without making a similar guarantee.
So the recommendation for types that want an Ord
implementation and to contain a Hash
would be to either make a wrapper, write it manually, or embed it as a [u8; 32]
- which I guess is what everyone is doing now anyway :)
Yeah if you need to sort hashes or something like that, my instinct is to convert them to [u8; 32]
. In --release
mode that conversion is almost certainly a no-op, and it makes it maximally clear what's going on.
@oconnor663 One mild annoyance with using [u8; 32]
, though, is that it doesn't provide convenient operations to format it as hex, the way blake3::Hash
does.
Yeah, in practice I make my own wrapper around [u8;32]
(with all the implementations I want) and convert back and forth as needed.
In particular I'm finding this issue not a big deal because I also am adding implementations for serialization and other nonsense that doesn't belong in the blake3 crate, and there's no OrphanInstances
in rust so I'd need to make it anyway.
Without an
Ord
implementation (or even aHash
implementation)blake3::Hash
values cannot be used as keys inBTreeMap
s orHashMap
s, among many other valid uses for (vartime) orderings.To be safe; these implementations ought to be constant-time. Either way, the subtle crate provides traits which are explicitly intended for constant time equality and ordering.
As a workaround, I'm using the newtype pattern to implement these traits on my own
OrdHash
type.If a PR would be welcome for this addition, should the implementation be constant time or variable?