Open csosto-pk opened 2 years ago
I rephrased to
The shared secret, K, is defined in [RFC4253] and [RFC5656] as an integer encoded as a multiple precision integer (mpint). The hybrid key exchange establishes two binary strings, K_CL and K_PQ, through scalar multiplication and post-quantum KEM encapsulation ('Encaps'), respectively. K is the concatenation of the two shared secrets K_CL and K_PQ. The order of concatenation is
K = K_CL || K_PQ
This is the same logic as in [I-D.ietf-tls-hybrid-design] where the classical and post-quantum exchanged secrets are concatenated and used in the key schedule.
The concatenated bytes are converted into K by interpreting the octets as an unsigned fixed-length integer encoded in network byte order. The mpint K is then encoded using the process described in Section 5 of [RFC4251], and the resulting bytes are fed, as described in [RFC4253], to the key exchange method's hash function to generate encryption keys.
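For anyone who wants to see the conversion concretely, here is a minimal Python sketch of the steps in the paragraph above. The function names `encode_mpint` and `hybrid_k_mpint` are made up for illustration and are not from the draft or any SSH library; the mpint rules follow Section 5 of RFC 4251.

```python
import struct


def encode_mpint(value: int) -> bytes:
    """Encode a non-negative integer as an SSH mpint (RFC 4251, Section 5)."""
    if value < 0:
        raise ValueError("this sketch only handles non-negative integers")
    if value == 0:
        # Zero is encoded as a zero-length string.
        return struct.pack(">I", 0)
    # bit_length() // 8 + 1 bytes leaves room for a leading 0x00 exactly when
    # the most significant bit of the first octet would otherwise be set.
    body = value.to_bytes(value.bit_length() // 8 + 1, "big")
    return struct.pack(">I", len(body)) + body


def hybrid_k_mpint(k_cl: bytes, k_pq: bytes) -> bytes:
    """Hypothetical helper: K = K_CL || K_PQ, read as an unsigned big-endian integer."""
    k = int.from_bytes(k_cl + k_pq, "big")
    return encode_mpint(k)
```

One thing the sketch makes visible: because K goes through an integer, any leading zero octets of K_CL are dropped by the mpint encoding, so the encoded length of K is not strictly fixed.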
@dstebila please check it out
We should add an EDNOTE explaining the Extract and Expand and using a hash in SSH too and discuss the topic in the WG.
What benefit(s) does this approach have over the existing approach? Is there an issue concentrating the two values through a hash function first?
There is also a performance aspect here. SSH uses the shared key K 6 times (both client and server-side) to generate session key material. The larger the K, the more compression function invocations you need.
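To put rough numbers on the performance point, here is a back-of-the-envelope Python sketch that counts SHA-256 compression calls for the six derivations. The sizes are illustrative assumptions only (32-byte H and session_id, K of 32 or 64 bytes), the possible extra leading zero byte of the mpint is ignored, and `sha256_blocks` and `derivation_cost` are hypothetical helper names.

```python
def sha256_blocks(message_len: int) -> int:
    """Number of 64-byte blocks SHA-256 processes for a message of this length.

    Padding adds one 0x80 byte and an 8-byte length field before rounding up.
    """
    return (message_len + 9 + 63) // 64


def derivation_cost(k_len: int, h_len: int = 32, session_id_len: int = 32) -> int:
    """Compression calls for the six HASH(K || H || X || session_id) derivations.

    The input is mpint(K) (4-byte length prefix plus k_len bytes), H,
    one letter byte, and session_id.
    """
    per_key = sha256_blocks(4 + k_len + h_len + 1 + session_id_len)
    return 6 * per_key


# Assumed sizes: 64-byte K = K_CL || K_PQ versus a 32-byte hashed K.
cost_concatenated = derivation_cost(64)
cost_hashed = derivation_cost(32) + sha256_blocks(64)  # plus the extra hash over K_CL || K_PQ
```

This does not account for implementations that reuse the hash state after absorbing the common K || H prefix, so treat it as an upper-bound style estimate.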
Just thoughts.
Hey @torben-hansen ,
Good point.
> What benefit(s) does this approach have over the existing approach? Is there an issue concentrating the two values through a hash function first?
I was trying to be consistent with the TLS 1.3 Hybrid Key Exchange draft, where they simply concatenate and then use the result in the key schedule. Given the size of the PQ shared secret, indeed hashing could add a few compressions.
I was also trying to be consistent with the proof in https://eprint.iacr.org/2018/903 which does not hash the concatenated keys.
We should add an EDNOTE to the draft to make sure we discuss this more in the list.
In the TLS 1.3 hybrid key exchange draft, we use the concatenation without hashing, but that concatenation is immediately fed into the TLS key derivation function which effectively does hash, and it's only the output of the TLS KDF that is a "named" key, e.g. the handshake secret. And then this (hashed) handshake secret is put into a KDF with the (hashed) transcript to make encryption keys, e.g. the client handshake traffic secret.
Mapping that on to SSH structures, one could go either way. Assuming the hash function is a random oracle, hashing the concatenation doesn't weaken security in any way.
If we do hash, we should be careful to make sure we're not losing security by hashing with a function whose output is smaller than the security level of the input schemes.
One issue that cropped up recently in the TLS 1.3 hybrid key exchange draft is the question of what happens if you assume the hash function is not collision resistant. There can be some quirks with concatenation if either of the shared secrets is variable length. Is that the case in SSH?
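A tiny Python illustration of the variable-length quirk: with plain concatenation, two different pairs of secrets can produce the same byte string. The length-prefixed variant is just one common way to remove the ambiguity and is shown only for contrast; it is an assumption of this sketch, not something the TLS or SSH drafts specify.

```python
def concat(a: bytes, b: bytes) -> bytes:
    """Plain concatenation, as in K = K_CL || K_PQ."""
    return a + b


def concat_length_prefixed(a: bytes, b: bytes) -> bytes:
    """Concatenation with 4-byte big-endian length prefixes (illustrative only)."""
    return len(a).to_bytes(4, "big") + a + len(b).to_bytes(4, "big") + b


# Two different (a, b) pairs that collide under plain concatenation.
pair_1 = (b"ab", b"c")
pair_2 = (b"a", b"bc")
same_plain = concat(*pair_1) == concat(*pair_2)
same_prefixed = concat_length_prefixed(*pair_1) == concat_length_prefixed(*pair_2)
```

If both secrets have a fixed length for a given key exchange method, the boundary is implied and the collision above cannot occur.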
Agree @dstebila . Yeah SSH does not have the extract and expand steps as we know them and I was thinking to keep it simple.
But I am not against changing it. We ought to discuss it at least. We should add an EDNOTE explaining the Extract and Expand and using a hash in SSH too.
> There can be some quirks with concatenation if either of the shared secrets is variable length. Is that the case in SSH?
SSH uses the same ECDH, X25519, and PQ KEMs, so I would say no. But we ought to keep an eye out for it. There is an EDNOTE in the Sec Considerations for this.
Thinking about this a bit more, SSH derives all these keys
HASH(K || H || "A" || session_id)
HASH(K || H || "B" || session_id)
HASH(K || H || "C" || session_id)
HASH(K || H || "D" || session_id)
HASH(K || H || "E" || session_id)
HASH(K || H || "F" || session_id)
directly from K. They are not HKDF-expand like in TLS, but they resemble HKDF-expand.
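As a reference point for the discussion, here is a minimal Python sketch of the six derivations listed above. It assumes K is already mpint-encoded and that one hash output is enough for each key (the extension step SSH uses for longer keys is left out); `derive_ssh_keys` is a hypothetical name, not an API from any SSH implementation.

```python
import hashlib


def derive_ssh_keys(k_mpint: bytes, h: bytes, session_id: bytes,
                    hash_name: str = "sha256") -> dict:
    """Compute HASH(K || H || X || session_id) for X in "A".."F".

    k_mpint is the mpint encoding of K; h is the exchange hash.
    Returns a dict mapping each letter to one hash output.
    """
    keys = {}
    for letter in b"ABCDEF":
        data = k_mpint + h + bytes([letter]) + session_id
        keys[chr(letter)] = hashlib.new(hash_name, data).digest()
    return keys
```

Each derivation hashes the full K again, which is where the cost of a larger K shows up six times per side.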
In that case maybe it would make sense for us to introduce an Extract step like
HASH(K_PQ || K_CL)
or even something more like HKDF-Extract like
HMAC-HASH(K_PQ , K_CL).
The latter is a drastic change though for SSH I think.
Anyway, we ought to spell this out as an EDNOTE to have a WG discussion.
You might be able to capture such extraction under the SSH key exchange method. The key exchange method is just defined to output a shared secret. There aren't many restrictions. So, this might not be as big a change.
> or even something more like HKDF-Extract like HMAC-HASH(K_PQ , K_CL).
The SSH construction would then almost be the same as the dualPRF combiner in https://eprint.iacr.org/2018/903 , but not 100%....
Option 1) following original SSH logic with some performance overhead
K = K_CL || K_PQ
Initial IV c2s: HASH(K || H || "A" || session_id)
Initial IV s2c: HASH(K || H || "B" || session_id)
Encryption key c2s: HASH(K || H || "C" || session_id)
Encryption key s2c: HASH(K || H || "D" || session_id)
Integrity key c2s: HASH(K || H || "E" || session_id)
Integrity key s2c: HASH(K || H || "F" || session_id)
Option 2) following SSH logic
(2a) K = HASH(K_CL || K_PQ), or
(2b) K = HMAC-HASH(K_PQ, K_CL), or
(2c) K = HMAC-HASH(0, K_CL || K_PQ)
Initial IV c2s: HASH(K || H || "A" || session_id)
Initial IV s2c: HASH(K || H || "B" || session_id)
Encryption key c2s: HASH(K || H || "C" || session_id)
Encryption key s2c: HASH(K || H || "D" || session_id)
Integrity key c2s: HASH(K || H || "E" || session_id)
Integrity key s2c: HASH(K || H || "F" || session_id)
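For comparison, the three Option 2 variants for computing K could be sketched in Python as follows. Two things are assumptions here: that the first argument of HMAC-HASH is the HMAC key, and that the "0" in (2c) means a string of HashLen zero bytes (as with the default salt in HKDF-Extract). SHA-256 is used only as an example hash, and the function names are hypothetical.

```python
import hashlib
import hmac


def k_2a(k_cl: bytes, k_pq: bytes) -> bytes:
    """(2a) K = HASH(K_CL || K_PQ)."""
    return hashlib.sha256(k_cl + k_pq).digest()


def k_2b(k_cl: bytes, k_pq: bytes) -> bytes:
    """(2b) K = HMAC-HASH(K_PQ, K_CL); assumes K_PQ is the HMAC key."""
    return hmac.new(k_pq, k_cl, hashlib.sha256).digest()


def k_2c(k_cl: bytes, k_pq: bytes) -> bytes:
    """(2c) K = HMAC-HASH(0, K_CL || K_PQ); assumes "0" is 32 zero bytes."""
    zero_key = bytes(hashlib.sha256().digest_size)
    return hmac.new(zero_key, k_cl + k_pq, hashlib.sha256).digest()
```

In all three variants K shrinks to one hash output, so the six derivations that follow hash the same amount of data regardless of the KEM's shared secret size.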
Option 3) using the dualPRF in @dstebila's paper and the Extract-and-Expand logic of TLS, NIST etc
K = HKDF-HASH(0, K_CL || K_PQ) // Extract
Initial IV c2s || Initial IV s2c || Encryption key c2s || Encryption key s2c || Integrity key c2s || Integrity key s2c = HKDF-HASH(K, H || session_id, 6 * size(HASH)) // Expand
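To make Option 3 concrete, here is a rough Python sketch using HKDF as defined in RFC 5869, written with the standard `hmac` module. It assumes SHA-256, a zero salt of HashLen bytes for the "0", and that each of the six outputs is exactly one hash length, as the 6 * size(HASH) above suggests. `option3_keys` and the label names are hypothetical.

```python
import hashlib
import hmac

HASH = hashlib.sha256
HASH_LEN = HASH().digest_size


def hkdf_extract(salt: bytes, ikm: bytes) -> bytes:
    """HKDF-Extract (RFC 5869): PRK = HMAC(salt, IKM); empty salt means zeros."""
    if not salt:
        salt = bytes(HASH_LEN)
    return hmac.new(salt, ikm, HASH).digest()


def hkdf_expand(prk: bytes, info: bytes, length: int) -> bytes:
    """HKDF-Expand (RFC 5869): T(i) = HMAC(PRK, T(i-1) || info || i)."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output too long for HKDF-Expand")
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), HASH).digest()
        okm += block
        counter += 1
    return okm[:length]


def option3_keys(k_cl: bytes, k_pq: bytes, h: bytes, session_id: bytes) -> dict:
    """Extract K from K_CL || K_PQ, then expand into six HashLen-sized outputs."""
    k = hkdf_extract(b"", k_cl + k_pq)
    okm = hkdf_expand(k, h + session_id, 6 * HASH_LEN)
    labels = ["iv_c2s", "iv_s2c", "enc_c2s", "enc_s2c", "mac_c2s", "mac_s2c"]
    return {name: okm[i * HASH_LEN:(i + 1) * HASH_LEN]
            for i, name in enumerate(labels)}
```

Compared with Options 1 and 2, the six outputs come from a single Expand call rather than six independent hashes, which is the part that would require changes to the SSH key schedule.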
Option 3 seems a more substantial revision which might be viewed as too far from the current SSH design. Similarly for (2b) or (2c). Overall it probably would be a good approach for SSH to move from basic hashing everywhere to use proper KDFs with extract/expand, but that might be a separate step from this hybrid draft.
Either of option 1 or 2a is fine I think. As far as I know SSH is not sufficiently performance sensitive for the difference between the two to matter much.
> Either of option 1 or 2a is fine I think. As far as I know SSH is not sufficiently performance sensitive for the difference between the two to matter much.
Yes, it may be worth putting this in an Appendix like you did in the TLS 1.3 draft, without tackling a change to the SSH key schedule.
I like 2a more because it has better performance as Torben has pointed out.
Additionally, on option 3: it would almost definitely require modifications in SSH software. How big a delta depends, though.
EDNOTE added and paragraphs consolidated in commit# 50712db862b7e23b36bd1721284571d22738678d
Currently the draft says
As per rfc8731, I don't think we need to hash the classical and PQ shared secrets to get K which will be used to derive keys.
In RFC8731, the Curve448 shared secret can be 56 bytes, so they don't hash it to 32. I think we should change our K to just be the concatenation of the classical and PQ shared secret. It will be much bigger than 56 bytes, but it still can be used to generate the keying material.
We should add an EDNOTE explaining the Extract and Expand and using a hash in SSH too and discuss the topic in the WG.