libp2p / specs

Technical specifications for the libp2p networking stack
https://libp2p.io
1.56k stars 273 forks source link

Noise Pipes fallback issues #246

Closed yusefnapora closed 3 years ago

yusefnapora commented 4 years ago

The way the Noise Pipes compound handshake protocol is currently spec'd out and implemented has an issue that is causing me to chase my tail a bit.

For a quick recap of terms, our Noise transport uses two main handshake patterns, XX and IK. The XX handshake takes 1.5 roundtrips and 3 handshake messages, while IK takes only one roundtrip and 2 messages. IK can fail, either because the other peer doesn't support it, or because their static key has rotated.

When IK fails, we "fallback" to using what is essentially the XX pattern. The Noise framework spec fallback modifier section describes the XXfallback variation on XX that can be used to recover from a failed IK attempt.

When I was writing the noise-libp2p spec, I misunderstood how the fallback process is supposed to work. In the bit on Noise Pipes flow, I wrote:

If Alice sends an XX message, she will always receive an XX-compatible response. However, if Alice sends an IK message, Bob may reply with either the second IK message, or the first message in the XXfallback sequence (aka the second message in XX).

The thing is, XX and XXfallback aren't actually directly compatible with each other. Since they're defined in the Noise framework spec as separate handshake patterns, they are supposed to have different Noise protocol names - Noise_XX_25519_ChaChaPoly_SHA256 vs Noise_XXfallback_25519_ChaChaPoly_SHA256. The hash of the protocol name is used to initialize both parties handshake states, so they have to match to complete the handshake.

The go implementation does (mostly) work though, because we're not setting the correct protocol name for XXfallback. The Noise Explorer generated code doesn't directly support XXfallback, so it was "spliced in" by ChainSafe when they kicked off the implementation, but the protocol name is always XX. I'm pretty sure JS is doing the same. So things work at the moment, but only because we're not quite correctly following the Noise framework spec.

Well, I say "things work" now, but there is one scenario that's broken in Go (and possibly JS, need to test). If the initiating peer (Alice) supports Noise Pipes, but responder Bob doesn't, the handshake fails. I think the failure is caused by our not-quite XXfallback implementation leading to Alice and Bob not agreeing on the contents of the first message in the sequence.

I came up with a way to "make it work" with the current approach, but it's super awkward. I would prefer to just do XXfallback correctly, which would actually be pretty easy in Go, since it's supported by https://github.com/flynn/noise, which I've discovered is much nicer to work with than the Noise Explorer code.

There is a new complication introduced by doing XXfallback correctly though. When Bob is responding to an incoming handshake, he needs to determine whether it's an IK or XX attempt. Right now, we always try IK, then fallback if that fails. However, if XX and XXfallback aren't directly compatible, that won't work. Alice will send an XX message, but get an XXfallback response.

There are two ways that I can think to deal with that.

  1. The initial IK message is larger than the minimum size of an initial XX message, so we could distinguish between them based on size, instead of always trying to decrypt as IK. Then Alice would never receive an XXfallback response to an XX message.

  2. We always use XXfallback, even for full 1.5 RTT "XX-style" handshakes. For the full handshake, Alice generates a new ephemeral key and sends it over the wire, but doesn't incorporate the outgoing message into her handshake state. Bob reads the key and initiates an XXfallback handshake to Alice, using the key he read as pre-message knowledge. Alice initiates her handshake as an XXfallback responder, using the key she generated as her local ephemeral key. This XXfallback all the time pattern is described in the handshake indistinguishability section of the Noise framework spec.

If we don't care about handshake indistinguishability (whether outside observers can guess whether we're doing XX or IK), then option 1 is easy. It's also nice because not all Noise libraries support fallback patterns. If we distinguish based on message size, we could still use those libraries and just not implement Noise Pipes, and we'd still be compatible with peers that do support Pipes.

With option 2, everyone has to support XXfallback whether they support Noise Pipes or not.

cc the noise spec interest group: @raulk, @tomaka, @romanb, @shahankhatch, @Mikerah, @djrtwo, @dryajov, @mpetrunic, @AgeManning, @morrigan, @araskachoi, @mhchia

yusefnapora commented 4 years ago

Just to clarify, above when I say that doing XXfallback correctly makes it not "compatible" with XX, I mean that if Alice initiates an XX handshake to Bob, Bob can't reply with an XXfallback message, because the protocol name affects the handshake hash and Alice won't be able to decrypt the reply unless she also reinitializes her handshake state to use the XXfallback pattern.

In short, if XX and XXfallback are distinct patterns, and we support both of them, both parties need to agree on which one is being used.

I'm going to try writing up all the Noise Pipes scenarios to see if it's possible to determine what to use based on message size (option 1 from above).

No one supports Noise Pipes

This one is easy, since there's no fallback involved or decisions to make.

Alice and Bob both support Pipes

This works, since an initial XX message is smaller than an initial IK message, and Bob can always know how to respond. Bob always responds to an XX message with XX, and responds to IK with either IK or XXfallback.

Alice does not support Pipes, but Bob does

This one works fine, since Bob can tell whether to send XX, and Alice will never initiate an IK to Bob.

Alice supports Pipes, but Bob does not

This is the problematic case. If XX and XXfallback are two distinct and incompatible protocols, Alice could receive three possible responses to an attempted IK handshake:

There doesn't seem to be way for Alice to distinguish the latter two cases from each other based on the message content.

So now I'm thinking that option 1 isn't viable, and if we want to use the correct protocol naming rules we need to go with option 2 (always do XXfallback). This works because we never need to distinguish between XX and XXfallback, just between IK and XXfallback, which we can do with trial decryption.

I've actually already implemented this option in Go using the flynn/noise library and verified that it works. I just wanted to avoid it if possible, because it forces all implementations to support the XXfallback pattern, even if they don't want to support Noise Pipes.

I'll plan to clean that branch up and make sure all the cases are tested, then I'll make a PR to go-libp2p-noise and link it back here.

raulk commented 4 years ago

The fact that we made Noise Pipes optional is, IMO, what creates the ambiguity.

If we only supported Noise Pipes, we could follow the recommendations in 10.5. Handshake indistinguishability. But given that we need to juggle both XX and IK/XXfallback, is what makes it difficult. Not just for the reasons laid out above, but because an IK-supporting Alice has no way of knowing if Bob supports IK to begin with. Therefore, it is never safe to use IK even if you support it.

If we're willing to stick with Noise Pipes (which is in dispute in #249), we'll need to make it compulsory and remove the separate/lone-standing XX pathway entirely.

Re: the protocol name passed at InitializeSymmetric, and what happens during a fallback. I found these docs on the C noise implementation clarifying, and @flynn's Go implementation mimics the C implementation.

Alice initiates a full handshake (i.e. XX equivalent)

  1. Alice calls InitializeSymmetric(XXfallback), sends the first message.
  2. On receiving the message, Bob attempts to decrypt the first message as an IK (InitializeSymmetric(IK)), with its static key, and fails.
  3. Bob falls back to XXfallback by reinitialising his handshake state and feeding in the pre-messages, fast-forwarding to step 2 of the handshake.
  4. Bob sends message 2 of the XXfallback handshake.
  5. Life goes on; both parties are using the same protocol name.

Alice initiates a succeeding IK handshake

  1. Alice has a static key for Bob, and calls InitializeSymmetric(IK), sends the first message.
  2. On receiving the message, Bob attempts to decrypt it and succeeds.
  3. Life goes on; both parties are using the same protocol name.

Alice initiates a failing IK handshake

  1. Alice has a static key for Bob, and calls InitializeSymmetric(IK), sends the first message.
  2. On receiving the message, Bob attempts to decrypt the first message as an IK (InitializeSymmetric(IK)), with its static key, and fails.
  3. Bob falls back to XXfallback by reinitialising his handshake state and feeding in the pre-messages, fast-forwarding to step 2 of the handshake.
  4. Bob sends message 2 of the XXfallback handshake.
  5. Alice tries to decrypt Bob's response as if it were message 2 of IK; it fails, and Alice reinitialises its handshake state to XXfallback too.

@yusefnapora -- does this answer clarify how we'd eventually resolve this discrepancy, if we decided to stick with Noise Pipes (discussing in #249)?

yusefnapora commented 4 years ago

@raulk that's exactly right, yes. If we did keep noise pipes, the full handshake has to be XXfallback as well, as far as I can tell.

jacobheun commented 4 years ago
  1. Bob falls back to XXfallback by reinitialising his handshake state and feeding in the pre-messages, fast-forwarding to step 2 of the handshake.
  2. Bob sends message 2 of the XXfallback handshake.
  3. Alice tries to decrypt Bob's response as if it were message 2 of IK; it fails, and Alice reinitialises its handshake state to XXfallback too.

Based on the fallback modified and how the handshake patterns are defined I don't think this is quite correct. I believe this should be:

  1. When Bob fails to decrypt the IK handshake he reinitiates with a XXFallback handshake as the initiator.
  2. Bob sends the 1st message in the XXfallback handshake, with prior knowledge of Alice's Ephemeral Key.
  3. Alice tries to decrypt Bob's response as if it were message 2 of IK; it fails, and Alice reinitializes her handshake state to XXfallback, as the responder.
IK:                   
  <- s  // Alice has prior knowledge of Bob's static key
  ...
  -> e, es, s, ss  // Alice sends first message of IK as initiator          
  <- e, ee, se  // Bob responds with second message of IK as responder

XXfallback:                   
  -> e  // Bob has prior knowledge of Alice's ephemeral key
  ...
  <- e, ee, s, es  // Bob sends first message of XXfallback as initiator          
  -> s, se  // Alice responds with second message of XXfallback as responder
djrtwo commented 4 years ago

With the removal of IK handshake (#249), is XXFallback to be removed as well?

mpetrunic commented 4 years ago

With the removal of IK handshake (#249), is XXFallback to be removed as well?

Yup, it's removed already in go and js afaik

AgeManning commented 4 years ago

Yep, it's not even in rust afaik

mxinden commented 3 years ago

I am closing here since https://github.com/libp2p/specs/pull/260 updated the specification to settle on the XX handshake pattern only. Please comment or re-open in case I am missing something.