cloudflare / quiche

🥧 Savoury implementation of the QUIC transport protocol and HTTP/3
https://docs.quic.tech/quiche/
BSD 2-Clause "Simplified" License

Handshake failure waits 3 seconds before connection closes. #1488

Open tegefaulkes opened 1 year ago

tegefaulkes commented 1 year ago

I'm running some tests to check that when a connection fails during the handshake, everything is handled as expected and the connection is cleaned up. What I expect is that when the handshake fails, a close frame is sent, both sides enter the draining state and then close, and that all of this happens relatively quickly.

This works as expected in most cases, except for one edge case: the server provides a self-signed ed25519 certificate and the client rejects it with a TlsFail error because it is self-signed. The client sends a close frame as expected and enters the draining state. The server processes this and enters the draining state as expected, but it then waits about 3 seconds on the timeout() and on_timeout() handling before entering the closed state. The client connection enters the closed state very quickly.
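To make the timeout() and on_timeout() flow described above concrete, here is a minimal Python sketch of a driver loop around a draining connection. `ToyConnection` and `drive_until_closed` are hypothetical names invented for illustration; the methods only mirror the names used above and are not quiche's API. The sketch assumes that a received close frame arms a single draining timer and that the connection only becomes closed once the application calls on_timeout() after that timer has expired.

```python
import time


class ToyConnection:
    """Hypothetical stand-in for a QUIC connection; not quiche's API."""

    def __init__(self, draining_period_s):
        self.draining_period_s = draining_period_s
        self.state = "handshaking"
        self._deadline = None

    def recv_connection_close(self, now):
        # A close frame from the peer moves us to draining and arms a timer.
        self.state = "draining"
        self._deadline = now + self.draining_period_s

    def timeout(self, now):
        # Time left until the armed timer fires, or None if no timer is armed.
        if self._deadline is None:
            return None
        return max(0.0, self._deadline - now)

    def on_timeout(self, now):
        # Closes only if the draining deadline has actually passed.
        if self._deadline is not None and now >= self._deadline:
            self.state = "closed"
            self._deadline = None


def drive_until_closed(conn, clock=time.monotonic, sleep=time.sleep):
    """Sleep for whatever timeout() reports, then call on_timeout()."""
    while conn.state != "closed":
        remaining = conn.timeout(clock())
        if remaining is None:
            break  # nothing scheduled; the caller must wait for packets
        sleep(remaining)
        conn.on_timeout(clock())
    return conn.state


conn = ToyConnection(draining_period_s=0.05)
conn.recv_connection_close(time.monotonic())
print(drive_until_closed(conn))  # closed
```

In this model the time spent in the draining state is set entirely by the draining period the connection chose, so a slow close would point at the timer value rather than at the event loop.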

In contrast, in the same test with an RSA-signed certificate, the same process happens but the server connection doesn't wait the 3 seconds that it does in the ed25519 case.

For reference, here are the packets that are sent over the network.

// server is on port 55555
// Packets using RSA cert
index, time, src port, bytes, ...
1   0.000000000 57087 1242  Initial, PKN: 0, CRYPTO
2   0.002199797 55555 250   Retry
3   0.003286059 57087 1242  Initial, PKN: 1, CRYPTO
4   0.007740237 55555 1242  Handshake, PKN: 0, CRYPTO
5   0.007906972 55555 412   Handshake, PKN: 1, CRYPTO
6   0.009247469 57087 1242  Handshake, PKN: 0, ACK
7   0.010162245 57087 113   Handshake, PKN: 1, CC

// Packets using ed25519 cert
index, time, src port, bytes, ...
1   0.000000000 58044 1242  Initial, PKN: 0, CRYPTO
2   0.003642721 55555 250   Retry
3   0.004872892 58044 1242  Initial, PKN: 1, CRYPTO
4   0.007808086 55555 1242  Handshake, PKN: 0, CRYPTO
5   0.009175480 58044 1242  Initial, PKN: 2, CC

So I have a few questions.

  1. Is the extra 3-second timeout before the server connection enters the closed state a bug in quiche, specifically when using an ed25519 certificate?
  2. Could it be some problem with my internal logic and event handling?
  3. What is actually expected to happen when a handshake fails, from both the internal-logic and the packet perspective? Is there some part of the docs that explains this behaviour?

I can provide some logging of the internal logic, but it's not very clean. Right now I just need more context on what is expected to happen.

CMCDragonkai commented 1 year ago

This applies to both ECDSA and Ed25519.

The following is a test case showing the state transitions of the quiche connection:

    ECDSA success
      ✓ client connect
      ✓ client dialing
      ✓ client and server negotiation (1 ms)
      ✓ server accept
      ✓ client <-initial- server (1 ms)
      ✓ client is established
      ✓ client -initial-> server (1 ms)
      ✓ server is established
      ✓ client <-short- server
      ✓ client -short-> server (1 ms)
      ✓ client and server established
      ✓ client close (23 ms)
    ECDSA fail verifying client
      ✓ client connect
      ✓ client dialing
      ✓ client and server negotiation (2 ms)
      ✓ server accept
      ✓ client <-initial- server
      ✓ client is established
      ✓ client -initial-> server (2 ms)
      ✓ client <-handshake- server (2 ms)
      ✓ client and server close (14 ms)
    ECDSA fail verifying server
      ✓ client connect
      ✓ client dialing
      ✓ client and server negotiation (2 ms)
      ✓ server accept
      ✓ client <-initial- server (2 ms)
      ✓ client -initial-> server (3 ms)
      ✓ client and server close (2995 ms)
    Ed25519 success
      ✓ client connect
      ✓ client dialing
      ✓ client and server negotiation (1 ms)
      ✓ server accept (1 ms)
      ✓ client <-initial- server
      ✓ client is established (1 ms)
      ✓ client -initial-> server
      ✓ server is established
      ✓ client <-short- server (1 ms)
      ✓ client -short-> server
      ✓ client and server established (1 ms)
      ✓ client close (23 ms)
    Ed25519 fail verifying client
      ✓ client connect (1 ms)
      ✓ client dialing
      ✓ client and server negotiation (1 ms)
      ✓ server accept (1 ms)
      ✓ client <-initial- server
      ✓ client is established
      ✓ client -initial-> server (2 ms)
      ✓ client <-handshake- server (2 ms)
      ✓ client and server close (13 ms)
    Ed25519 fail verifying server
      ✓ client connect
      ✓ client dialing
      ✓ client and server negotiation (2 ms)
      ✓ server accept
      ✓ client <-initial- server (2 ms)
      ✓ client -initial-> server (2 ms)
      ✓ client and server close (2997 ms)

We can see that at the end, when the client fails to verify the server, the server connection takes about 3 seconds to time out. This does not happen with RSA though.
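One possible reading of the roughly 2997 ms figure, offered as an assumption rather than a confirmed diagnosis: RFC 9000 recommends that the closing and draining states last at least three times the current PTO, and RFC 9002 falls back to an initial RTT of 333 ms when no RTT sample is available. If the server never saw an ACK for its Handshake packet (as in the ed25519 trace, where the client answers with CC directly), it may still be using that initial RTT. The Python sketch below works through the arithmetic. The function names are hypothetical, the 1.5 ms sample is an illustrative value, and none of this is quiche's code.

```python
# Constants from RFC 9002, in milliseconds.
K_INITIAL_RTT_MS = 333.0
K_GRANULARITY_MS = 1.0


def pto_ms(first_rtt_sample_ms=None):
    """PTO in the Initial/Handshake spaces, where max_ack_delay is not added."""
    if first_rtt_sample_ms is None:
        # No RTT sample yet: use the initial RTT.
        smoothed_rtt = K_INITIAL_RTT_MS
    else:
        # The first sample sets smoothed_rtt directly.
        smoothed_rtt = first_rtt_sample_ms
    # In both cases rttvar starts at half of smoothed_rtt.
    rttvar = smoothed_rtt / 2
    return smoothed_rtt + max(4 * rttvar, K_GRANULARITY_MS)


def draining_period_ms(first_rtt_sample_ms=None):
    """Three times the current PTO, the minimum RFC 9000 recommends."""
    return 3 * pto_ms(first_rtt_sample_ms)


print(draining_period_ms())     # 2997.0 (no RTT sample)
print(draining_period_ms(1.5))  # 13.5   (hypothetical 1.5 ms first sample)
```

Under these assumptions, an endpoint that received at least one ACK would drain within milliseconds on a local link, while one that received none would drain for about 3 seconds. That would be consistent with the RSA trace containing a Handshake ACK from the client and the ed25519 trace containing none.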