jedisct1 / libsodium

A modern, portable, easy to use crypto library.
https://libsodium.org
Other
12.15k stars 1.73k forks source link

Clarifying keyed BLAKE2b bit security #1033

Closed samuel-lucas6 closed 3 years ago

samuel-lucas6 commented 3 years ago

I would like to ask what the bit security is for keyed BLAKE2b and BLAKE2b when used as a KDF. My thinking being that using it as a KDF to derive a 256-bit encryption key may be the weakest link in the chain.

The RFC notes that the bit security is 128-bit for unkeyed BLAKE2b-256 (the collision resistance), but it's unclear whether keyed BLAKE2b or using it as a KDF offers a different level of security. Is it all just related to the output length?

I should explain that my confusion stems from the fact that the information on HMAC is mixed. I've seen people saying that HMAC-SHA256 offers 128-bit security, whereas others claim it offers 256-bit security because collision resistance isn't an issue with HMAC. Then HKDF-SHA256 apparently offers 256-bit security because it's bit security is related to the size of the input keying material.

Thanks. Sorry for asking so many questions. I never know who to ask for questions like these.

jedisct1 commented 3 years ago

https://crypto.stackexchange.com may be a better place to ask for questions like these, if only because this is where most other people looking for such information are going to look at first.

The libsodium issue tracker is for reporting bugs and discussions directly related to the project.

What the security level means depends on what attacks are relevant to your protocol. min(key_size, sqrt(output_size)) may be a reasonable general estimation.

samuel-lucas6 commented 3 years ago

I probably should have opened this in the libsodium-doc repo because you mention the security of the cipher being reduced to the one of the hash function here, but it isn't clear what the security of the hash function actually is.

BLAKE3 defines the security level as 128-bit in their paper, but I couldn't find any security level discussion in the BLAKE2 documentation. BLAKE2b is defined as 256-bit on Wikipedia, but that obviously isn't true in all cases. Answers to these questions don't seem to be documented, or they're hidden very well.

As for Cryptography Stack Exchange, I have asked questions there in the past, but the answers typically aren't particularly helpful, nobody knows the answer, or the answer seems to be outright incorrect. I wouldn't necessarily class it as a reliable source of information.

Anyway, I completely understand your point. I'll stop pestering you.

emilbayes commented 3 years ago

You Wikipedia link gave you a pretty good answer: approx 256 bits for Blake2b-512 (sqrt(2^512) = 2^256). This is the collision/birthday attack on the hash function itself. Since this is a collision attack (two distinct inputs yielding the same key in your case) this would be your upper bound. Lower bound would be entropy of the input to your KDF. I don’t know if this is your question, but it comes very close https://crypto.stackexchange.com/questions/88248/hkdf-bit-security

emilbayes commented 3 years ago

If you’re interested in learning more about these notions the Blake book is a very good read

samuel-lucas6 commented 3 years ago

You Wikipedia link gave you a pretty good answer: approx 256 bits for Blake2b-512 (sqrt(2^512) = 2^256). This is the collision/birthday attack on the hash function itself. Since this is a collision attack (two distinct inputs yielding the same key in your case) this would be your upper bound. Lower bound would be entropy of the input to your KDF. I don’t know if this is your question, but it comes very close https://crypto.stackexchange.com/questions/88248/hkdf-bit-security

Thank you. It seems that BLAKE2b-256 as a KDF offers 256-bit security because collision resistance doesn't really matter in that context. Yes, that was my question. I perhaps should have split it into multiple questions, but not many people seem to know much about BLAKE2 on there based on my experience.

emilbayes commented 3 years ago

Perhaps it was a typo but blake2b-256 would only give you 128 bits (remember sqrt(2^256) = 2^(256/2) = 2^128)

samuel-lucas6 commented 3 years ago

Perhaps it was a typo but blake2b-256 would only give you 128 bits (remember sqrt(2^256) = 2^(256/2) = 2^128)

I agree that BLAKE2b-256 provides 128-bit security, but I have been told that collision resistance isn't particularly important when it comes to key derivation (like with HKDF as I mentioned in the first message). However, I guess it's still safest to assume the worst.

samuel-lucas6 commented 3 years ago

Perhaps it was a typo but blake2b-256 would only give you 128 bits (remember sqrt(2^256) = 2^(256/2) = 2^128)

@emilbayes After doing some more research, it's still not clear what the correct answer to my question is. On the Cryptography Stack Exchange, different people are giving different answers, although none of these are about BLAKE2b specifically. However, I believe the same principles should apply since SHA256 offers 128-bit collision resistance for example.

I'm sure there are more examples. I might not be interpreting some of the responses correctly, but it feels like very few people actually know what the security level of a MAC is. This topic doesn't seem to be documented well at all.

As for BLAKE2b, there's barely any information about using it as a KDF. Moreover, like I've said before, there doesn't seem to be much information about BLAKE2 on the Cryptography Stack Exchange either. However, several people I've spoken to suggest that the collision resistance doesn't really apply without explaining why. I'm guessing their reasoning is again due to the use of the key.

I hope you can understand my confusion and reasoning for asking this question.

Edit: The book Cryptography Engineering apparently says that a 256-bit MAC provides 128-bit security, although that still doesn't explain how HKDF with SHA256 has 256-bit security.

It makes sense that collision resistance is important, which is why I don't understand why people are providing different answers.

mouse07410 commented 3 years ago

Hash functions prior to the SHA-3 competition, did not have "output pseudorandomness" as a design requirement.

All of the SHA-3 candidates can be used as KDF, without the need to add HMAC construct.

Edit: attacks against KDF differ from those against MAC. That explains different security properties.

emilbayes commented 3 years ago

Edit: The book Cryptography Engineering apparently says that a 256-bit MAC provides 128-bit security, although that still doesn't explain how HKDF with SHA256 has 256-bit security.

I have the book here and it specifically says:

However, HMAC - Like CMAC - is still limited to n/2 bits of security, as there are generic birthday attacks against the function that make use of the internal collisions of the iterated hash function.

However, I think what you are really asking, and what you got answered on Stack Exchange, is; what is the entropy of a subkey m given a master key M of n bits of entropy. The answer there is at most the output of the hash function and at least the entropy of the master key, but that's assuming the hash function is a random oracle. Attacks on the hash function itself are only relevant to you if you intend on sharing the subkey itself.

The reason HMAC and HKDF come up is because what you are really looking at here is a PRF (pseudo random function), with a secret key K and some kind of tweak, eg. a counter like in libsodium's crypto_kdf API.

emilbayes commented 3 years ago

To answer the question about preserving entropy, here's one on Crypto Stackexchange: https://crypto.stackexchange.com/a/10405

samuel-lucas6 commented 3 years ago

Edit: The book Cryptography Engineering apparently says that a 256-bit MAC provides 128-bit security, although that still doesn't explain how HKDF with SHA256 has 256-bit security.

I have the book here and it specifically says:

However, HMAC - Like CMAC - is still limited to n/2 bits of security, as there are generic birthday attacks against the function that make use of the internal collisions of the iterated hash function.

However, I think what you are really asking, and what you got answered on Stack Exchange, is; what is the entropy of a subkey m given a master key M of n bits of entropy. The answer there is at most the output of the hash function and at least the entropy of the master key, but that's assuming the hash function is a random oracle. Attacks on the hash function itself are only relevant to you if you intend on sharing the subkey itself.

The reason HMAC and HKDF come up is because what you are really looking at here is a PRF (pseudo random function), with a secret key K and some kind of tweak, eg. a counter like in libsodium's crypto_kdf API.

Ok thank you, that makes sense. So the security level of a MAC such as keyed BLAKE2b or HMAC is half the output length rather than related to the size of the key. Got it.

approx 256 bits for Blake2b-512 (sqrt(2^512) = 2^256). This is the collision/birthday attack on the hash function itself. Since this is a collision attack (two distinct inputs yielding the same key in your case) this would be your upper bound. Lower bound would be entropy of the input to your KDF.

The output having a high entropy makes perfect sense. My other question is more about whether the collision resistance is important. My surprise at HKDF with SHA256 providing 256-bit security is related to this idea of collisions.

My concern is that if I'm using BLAKE2b-256 as a KDF to derive a 256-bit encryption key, then I'm limiting the entire system to 128-bit security. Two inputs can theoretically collide after 2^128 operations. Therefore, even though the input keying material is 256-bit and the output keying material is 256-bit, there will be a collision before 2^256 different keys are used as input keying material.

Doesn't that mean the KDF can only be used to produce 2^128 unique keys? It's halving the number of possible keys that can be derived. I know that's an extremely large number, but that's still half of 2^256 (a much larger number) and theoretically impacts things like nonce reuse because the same keys will be derived again at a certain point.

Maybe I'm just being overly paranoid, but I would argue that there's no harm (besides sometimes storing more data) opting for a goal of 256-bit security. So my final question is whether it's possible to use BLAKE2b to derive a 256-bit key with 256-bit security (aka 256-bit collision resistance). I'm guessing the answer is no as you'd need an output length of 512-bit for 256-bit security.

Any recommendations for 256-bit security level key derivation would be welcome. Huge thanks for all the explanations @emilbayes :)

jedisct1 commented 3 years ago

Assuming a uniform distribution, a collision is already bound to occur after 2^128 randomly generated 256-bit values, no matter what functions these will eventually be used for.

But this is not something to worry much about. 2^128 is not just a big number. The universe will go extinct before you will have generated so many keys, not even considering the communication and storage requirements exploiting this collision would need.

emilbayes commented 3 years ago

What @jedisct1 just said, but also I think you are misinterpreting what I said. Collision resistance is important, but see Thomas Porrins answer on the theory. If you use Blake2b-512, with a 512 bit random key, then you should get (1 - 1/e) * 2 ^512 bits of entropy preserved through the hash function, which is approx 2^511.33 bits of entropy (log2(1 - 1/e) ~ -0.6617..., ie. (1 - 1/e) ~ 2^(-0.6617...)). How you use that key is what decides how entropy is further lost. Also to echo what @jedisct1 just said, most ciphers aim for 2^128 bits of security for exactly the reasons he mentioned.

2^128 is not half of 2^256, but the square root. 2^255 would be half of 2^256 (since 2 * 2^255 = 2^1 + 2^255 = 2^(255 + 1)).

The reason collision attacks keep coming up with HMAC is that most of the time, the HMAC output is exposed in a way that any adversary can observe it, and hence try and exploit this "weakness".

I understand where you are coming from wrt your concerns, but you are moving back and forth between applied and theoretical cryptography. Keep in mind which is which :)

samuel-lucas6 commented 3 years ago

Oh dear... Maths never was my strong point. Thank you very much @jedisct1 and @emilbayes. The whole 'quantum computers are going to break the internet, upgrade to 256-bit security!' has clearly got to me. @emilbayes I'll reread that answer from Thomas Pornin. Sorry for asking/discussing this here @jedisct1.