libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io
1.56k stars 273 forks source link

Add quic-noise spec. #351

Closed dvc94ch closed 3 years ago

dvc94ch commented 3 years ago

Closes #315 Related-to https://github.com/libp2p/rust-libp2p/pull/2144

dvc94ch commented 3 years ago

It is specified in the section handshake session. It explains how 0rtt, 1rtt and nextrtt keys are derived

dvc94ch commented 3 years ago

So the cipher is described here:

For fast authenticated encryption a chacha8poly1305 cipher is used.

We can discuss the choice of cipher, and I can link to previous discussions on the topic.

No header protection is used, header protection is considered controversial and it's utility questionable. (I can dig up a paper analyzing quic's security and making this claim). I guess I should mention that in the spec. Besides ossification is not something I'm concerned about for quic-noise. If anything my concern would be that it works at all.

dvc94ch commented 3 years ago

So regarding cipher choice:

  1. Unless you can guarantee that all platforms have hardware support for aes-gcm stick with chacha-poly1305 [0]
  2. hardware accelerated aes-gcm performance is comparable to chacha20poly1305 (this can be verified in a benchmark)
  3. chacha12poly1305 is considered secure and used in the rand crate [1]
  4. chacha8poly1305 is a bit more questionable but there is no "real" argument against it and is recommended in too much crypto [2][1]
  5. chacha8poly1305 is considerably faster than hardware accelerated aes-gcm on all platforms (this can be verified in a benchmark)
  6. cipher performance is important for quic as it is responsible for around 50% of the cpu time (this can be verified in a benchmark)

EDIT: if chacha6 is broken in 10y we can still upgrade to chacha12 by incrementing the version.

marten-seemann commented 3 years ago

These are a few points off the top of my head (without even looking into the QUIC RFCs), that one would need to be address in a document that could rightfully claim to be a specification of Noise+QUIC. I'm sure that if I started re-reading the RFCs, I'd find a lot more problems.

Basically, what one would have to do is produce a document along the lines of RFC 9001.

Choise of IK handshake

As you point out, it's required to know your peer's public key to use this handshake pattern. This might be fine, but I'd like to point out that in libp2p you don't always (though most of the time you do) know the peer's identity before connecting to its multiaddr. To my knowledge, rust-libp2p currently already implements this. Using IK means that it's not possible to use Noise+QUIC in that case.

Mapping to packet types

QUIC defines a couple of different packet types: Initial, Retry, Handshake, 0-RTT and 1-RTT packets (and Version Negotiation). It's not clear how the IK handshake maps onto these packet types. For one, it seems like Retry won't be possible any more (I'm guessing, since there's nothing in the spec about it). Then it's not clear how / if one would still use Initial packets, or start the handshake using only Handshake packets.

Choice of key derivation

I have to admit, I'm not very familiar with how "normal" Noise derives its keys, but it seems like you're using a special key derivation here based on xoodyak. Not being a cryptographer, I have no idea if this makes sense or if it doesn't. However, it feels like this is a stark deviation from the Noise motto "pick your handshake pattern and just use Noise as a black box and it will be secure". This feels a lot like rolling out our own handshake protocol.

Choice of cipher suite

chacha8poly1305 is a bit more questionable but there is "real" argument against it and is recommended in too much crypto

This is a significant deviation from TLS 1.3, which uses chacha20poly1305. Why is that one good enough for TLS 1.3 and QUIC, but not good enough for Noise+QUIC? Anyway, switching to a different cipher suite would require a comprehensive security analysis, especially with regards to header protection (see separate section below) and key updates (see separate section below).

Header protection

No header protection is used, header protection is considered controversial and it's utility questionable.

I would be strongly opposed to removing this feature from QUIC. How is header protection controversial anyway? The QUIC working group has had many contentious discussions about many topics, but header protection was not among them. The utility of header protection is that it hides header fields from middleboxes, thereby preventing ossification of the protocol. Also, if header protection was removed, this definitely belongs in the spec.

Framing of handshake messages

I assume you intend to use CRYPTO frames, but the spec doesn’t say anything about this. Also, due to the different packet types, you effectively have 3 different CRYPTO byte streams, so it needs to be defined how they’re used and which message goes where. You’ll also need to take special care about the boundaries, as RFC 9000 does.

Application Protocol Negotiation

QUIC requires the use of ALPN to negotiate the application protocol and avoid protocol confusion attacks. It looks like this was removed entirely. This might be an acceptable choice for Noise+QUIC/libp2p variant, but probably not for a general purpose Noise+QUIC protocol.

Transport Parameters

QUIC uses TLS extensions to send the QUIC transport parameters. The Noise variant will need to send transport parameters some other way, but the spec doesn’t say anything about this either. Absent of any meaningful specification, I assume that the || means concatenation. Maybe you intend to shove them into the CRYPTO frame? That might not be a bad solution. But then you also use || to concatenate 0-RTT and 1-RTT data, and that wouldn't make any sense at all, given that QUIC comes with native stream multiplexing.

Discarding of keys used during the handshake

When are keys used during the handshake discarded? I can't find anything about this topic in the spec. Is this completely analogous to how TLS does this? Note that this can have significant implication on loss recovery during the handshake, and that in QUIC we had to introduce a HANDSHAKE_DONE frame pretty late in the process just to prevent a deadlock associated with this. It would be crucial to translate what concepts like Handshake Confirmed means in the context of Noise+QUIC.

Key Updates

How do key updates work? The spec also remains silent about this topic.

0-RTT

Peers are required to remember certain transport parameters from the original connection. In TLS, a server can do so statelessly by encoding them into the session ticket. Losing that capability would severely limit a server’s ability to offer 0-RTT. Unfortunately, the spec doesn’t even mention 0-RTT, so I have no idea how this is intended to work. Sending 0-RTT data on a newly dialed connection, as the IK handshake allows, would require a major overhaul of the QUIC protocol: In RFC 9000, before initiating any streams and sending any data, you need to receive the peers transport parameters (or remember them from the previous session) to learn how many streams you're allowed to open and how much data you're allowed to send. How is this supposed to work in Noise+QUIC? This will require some careful design work.

marten-seemann commented 3 years ago

Now this became a lot longer than I initially planned. In fact, my list of things missing from the spec is now more than 3x longer than the spec itself. Given that, I don't think it makes sense to resolve these deficiencies using the GitHub PR review process, which is why I'm going to close this PR now.

dvc94ch commented 3 years ago

Well, I assume you're not really interested on a response to those points. But thank's for closing the PR so quickly, saves me wasting my time. While I disagree with you I honestly appreciate that very much.

dvc94ch commented 3 years ago

Choise of IK handshake As you point out, it's required to know your peer's public key to use this handshake pattern. This might be fine, but I'd like to point out that in libp2p you don't always (though most of the time you do) know the peer's identity before connecting to its multiaddr. To my knowledge, rust-libp2p currently already implements this. Using IK means that it's not possible to use Noise+QUIC in that case.

Yes that is correct. I'm aware of that and this choice was made to support 0-rtt. This was a requested feature here [0].

Mapping to packet types QUIC defines a couple of different packet types: Initial, Retry, Handshake, 0-RTT and 1-RTT packets (and Version Negotiation). It's not clear how the IK handshake maps onto these packet types. For one, it seems like Retry won't be possible any more (I'm guessing, since there's nothing in the spec about it). Then it's not clear how / if one would still use Initial packets, or start the handshake using only Handshake packets.

The retry mechanism is the same as for tls. I saw no need to modify it. In the handshake session the Initial, Handshake crypto frames are specified and how the keys for the packets are derived. There is nothing special about the 0-rtt and 1-rtt packets other than they're encrypted with the 0-rtt and 1-rtt keys.

Choice of key derivation I have to admit, I'm not very familiar with how "normal" Noise derives its keys, but it seems like you're using a special key derivation here based on xoodyak. Not being a cryptographer, I have no idea if this makes sense or if it doesn't. However, it feels like this is a stark deviation from the Noise motto "pick your handshake pattern and just use Noise as a black box and it will be secure". This feels a lot like rolling out our own handshake protocol.

There were a few issues in mapping noise directly to quic. One of them is that snow provides no api for adding associated data to a message which itself would not be that hard to add [1]. Another one is that a quic packet may have multiple frames. To let noise do it's thing you'd need to add the other frames as a payload to the (encrypted) crypto frame. Safely extracting keys from the noise state that could be used for packet protection is indeed something that would require some security analysis beyond what I can do. What I can do is pick something and use it how it was intended to be used. Since xoodyak is being used the way it was designed to I'm pretty confident that this is fine.

Choice of cipher suite chacha8poly1305 is a bit more questionable but there is "real" argument against it and is recommended in too much crypto This is a significant deviation from TLS 1.3, which uses chacha20poly1305. Why is that one good enough for TLS 1.3 and QUIC, but not good enough for Noise+QUIC? Anyway, switching to a different cipher suite would require a comprehensive security analysis, especially with regards to header protection (see separate section below) and key updates (see separate section below).

we can add more rounds if that makes people more comfortable. I'm not really loosing any sleep because of it. We can also make the number of rounds to be configurable.

Header protection No header protection is used, header protection is considered controversial and it's utility questionable. I would be strongly opposed to removing this feature from QUIC. How is header protection controversial anyway? The QUIC working group has had many contentious discussions about many topics, but header protection was not among them. The utility of header protection is that it hides header fields from middleboxes, thereby preventing ossification of the protocol. Also, if header protection was removed, this definitely belongs in the spec.

also not loosing much sleep over this one. but we could easily reuse the tls mechanism and specify a key in the spec.

Framing of handshake messages I assume you intend to use CRYPTO frames, but the spec doesn’t say anything about this. Also, due to the different packet types, you effectively have 3 different CRYPTO byte streams, so it needs to be defined how they’re used and which message goes where. You’ll also need to take special care about the boundaries, as RFC 9000 does.

not entirely sure what you mean here, but I'm sure it can be clarified. I sometimes implement specs I don't usually write any.

Application Protocol Negotiation QUIC requires the use of ALPN to negotiate the application protocol and avoid protocol confusion attacks. It looks like this was removed entirely. This might be an acceptable choice for Noise+QUIC/libp2p variant, but probably not for a general purpose Noise+QUIC protocol.

good point. yes some alpn mechanism could/should be added. opened an issue for this [3]

Transport Parameters QUIC uses TLS extensions to send the QUIC transport parameters. The Noise variant will need to send transport parameters some other way, but the spec doesn’t say anything about this either. Absent of any meaningful specification, I assume that the || means concatenation. Maybe you intend to shove them into the CRYPTO frame? That might not be a bad solution. But then you also use || to concatenate 0-RTT and 1-RTT data, and that wouldn't make any sense at all, given that QUIC comes with native stream multiplexing.

yes, the transport parameters are added to the crypto frame and || means concatenation. also needs to be clarified.

Discarding of keys used during the handshake When are keys used during the handshake discarded? I can't find anything about this topic in the spec. Is this completely analogous to how TLS does this? Note that this can have significant implication on loss recovery during the handshake, and that in QUIC we had to introduce a HANDSHAKE_DONE frame pretty late in the process just to prevent a deadlock associated with this. It would be crucial to translate what concepts like Handshake Confirmed means in the context of Noise+QUIC.

Also nothing special was done here. There was one issue with the state machine not working due to assumptions about tls. [2]

Key Updates How do key updates work? The spec also remains silent about this topic.

  | Ratchet()
  | initiator-next-1rtt-key = SqueezeKey(32)
  | responder-next-1rtt-key = SqueezeKey(32)

you can continue squeezing keys from the handshake state. quinn takes care of this, would need to look into the exact conditions when this gets called, but the code is the same as for tls.

0-RTT Peers are required to remember certain transport parameters from the original connection. In TLS, a server can do so statelessly by encoding them into the session ticket. Losing that capability would severely limit a server’s ability to offer 0-RTT. Unfortunately, the spec doesn’t even mention 0-RTT, so I have no idea how this is intended to work. Sending 0-RTT data on a newly dialed connection, as the IK handshake allows, would require a major overhaul of the QUIC protocol: In RFC 9000, before initiating any streams and sending any data, you need to receive the peers transport parameters (or remember them from the previous session) to learn how many streams you're allowed to open and how much data you're allowed to send. How is this supposed to work in Noise+QUIC? This will require some careful design work.

quinn somehow takes care of this. I'd have to investigate it more closely. I think having conservative defaults should work.

DemiMarie commented 3 years ago

Well, I assume you're not really interested on a response to those points. But thank's for closing the PR so quickly, saves me wasting my time. While I disagree with you I honestly appreciate that very much.

What makes you assume this?

dvc94ch commented 3 years ago

Now this became a lot longer than I initially planned. In fact, my list of things missing from the spec is now more than 3x longer than the spec itself. Given that, I don't think it makes sense to resolve these deficiencies using the GitHub PR review process, which is why I'm going to close this PR now.

This didn't sound like, I think there is room for improvement, let's try to make it better. The first comment was hard to guess at what the problem was. The second one was more constructive, so I could actually understand the issues. Then the follow up was let's close it because it is unsalvageable which gave me the impression that there was more interest in finding reasons to close it than finding ways to improve it.

DemiMarie commented 3 years ago

Now this became a lot longer than I initially planned. In fact, my list of things missing from the spec is now more than 3x longer than the spec itself. Given that, I don't think it makes sense to resolve these deficiencies using the GitHub PR review process, which is why I'm going to close this PR now.

This didn't sound like, I think there is room for improvement, let's try to make it better. The first comment was hard to guess at what the problem was. The second one was more constructive, so I could actually understand the issues. Then the follow up was let's close it because it is unsalvageable which gave me the impression that there was more interest in finding reasons to close it than finding ways to improve it.

The way I read it was, “GitHub issues aren’t a good place for this kind of discussion; let’s discuss this elsewhere.”

dvc94ch commented 3 years ago

I guess if you squint your eyes you could arrive at that conclusion, and I'm not uninclined to believe it. However to be able to interpret it the way, a show of good will would have been to suggest an alternative place to discuss it.

dvc94ch commented 3 years ago

However constructive criticism is only constructive if it is intended that way. So at least allowing for a rebuttal before closing the PR shows that you are actually interested in discussing those points.

rklaehn commented 3 years ago

For the record, I have to say that I also find the dismissal rude and condescending.

whyrusleeping commented 3 years ago

A dismissal would be closing it without considering the contents of the PR, marten left very detailed consideration of the contents of the PR, then requested that we not use a pull request to have this discussion. Closing the PR immediately does track as a sensical next course of action, though I would have personally waited for @dvc94ch to close it himself (@marten-seemann, feedback for the future).

@dvc94ch

However to be able to interpret it the way, a show of good will would have been to suggest an alternative place to discuss it.

This is another bad-faith argument, marten commented in your other issue thread ten minutes later, providing a clear place to continue the conversation.

dvc94ch commented 3 years ago

On second reading your interpretation of events does seem plausible

dvc94ch commented 3 years ago

It was not clear to me how to respond to those points. If you get a list of "deficiencies" and then move the conversation somewhere else, that doesn't make it easy to address those points or request for clarification. Also there is the concept of home advantage when engaging in a debate or in sports. So pointing out deficiencies and then moving the discussion to a different avenue is similar to the home team deciding to play the second half time in a different court. We can argue to what degree the other team was notified of this change in avenue, it wasn't obvious to me that he expected me to reply in the issue.