Mbed-TLS / mbedtls

An open source, portable, easy to use, readable and flexible TLS library, and reference implementation of the PSA Cryptography API. Releases are on a varying cadence, typically around 3 - 6 months between releases.
https://www.trustedfirmware.org/projects/mbed-tls/

Intermittent failure of "Sample: dtls_server, openssl client, DTLS 1.2" #9652

Closed gilles-peskine-arm closed 1 month ago

gilles-peskine-arm commented 1 month ago

The ssl-opt test case "Sample: dtls_server, openssl client, DTLS 1.2", added by https://github.com/Mbed-TLS/mbedtls/pull/9638 and https://github.com/Mbed-TLS/mbedtls/pull/9541, is failing intermittently on the CI.

I didn't observe this failure during development, but it has failed several times since it was merged.

Sample logs: all_u16-test_psa_crypto_config_reference_ecc_no_bignum-o-srv-892.log.txt and all_u16-test_psa_crypto_config_reference_ecc_no_bignum-o-cli-892.log.txt

The logs show a successful connection (handshake and two-way data transfer). Then the server receives an extra packet on the same port and rejects it as an unexpected message. The client logs look normal. The server logs:

# Sample: dtls_server, openssl client, DTLS 1.2
../programs/ssl/dtls_server
  . Seeding the random number generator... ok

  . Loading the server cert. and key... ok
  . Bind on udp/*/4433 ... ok
  . Setting up the DTLS data... ok
  . Waiting for a remote connection ... ok
  . Performing the DTLS handshake... hello verification requested
  . Waiting for a remote connection ... ok
  . Performing the DTLS handshake... ok
  < Read from client: 15 bytes read

GET / HTTP/1.0

  > Write to client: 15 bytes written

GET / HTTP/1.0

  . Closing the connection... done
  . Waiting for a remote connection ... ok
  . Performing the DTLS handshake... failed
  ! mbedtls_ssl_handshake returned -0x7700

Last error was: -30464 - SSL - An unexpected message was received from our peer

  . Waiting for a remote connection ...

mpg commented 1 month ago

Looking at a Wireshark capture of this test running locally, it appears the client sends an encrypted alert (most probably close_notify) right after its ApplicationData record. That makes complete sense considering how it's invoked (echo "..." | openssl s_client: send one message, then close the connection).

I'm wondering if this could be a race condition where, in some cases, the client's close_notify only reaches the server after the server has closed the connection and started listening for new connections. At that point the server expects a ClientHello, so an encrypted close_notify is going to be quite unexpected indeed.
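
To make the suspected race concrete, here is a minimal Python model; it is not Mbed TLS code. It assumes, as a simplification that the logs do not confirm, that a datagram arriving while the connection is still open is discarded together with that connection, whereas one arriving afterwards is picked up by the next "Waiting for a remote connection" step. The 13-byte record header layout is the DTLS 1.2 one; the function names and placeholder payloads are hypothetical.

```python
import struct
from collections import deque

DTLS12 = 0xFEFD                          # DTLS 1.2 version on the wire
ALERT, HANDSHAKE, APPDATA = 21, 22, 23   # record content types
CLIENT_HELLO = 1                         # handshake message type


def record(content_type, epoch, seq, payload):
    """Build one DTLS 1.2 record: type, version, epoch, 48-bit seq, length."""
    header = struct.pack("!BHH", content_type, DTLS12, epoch)
    header += seq.to_bytes(6, "big") + struct.pack("!H", len(payload))
    return header + payload


def looks_like_initial_client_hello(datagram):
    """True if the first record is an epoch-0 handshake record holding a ClientHello."""
    if len(datagram) < 14:
        return False
    content_type, _version, epoch = struct.unpack("!BHH", datagram[:5])
    return (content_type == HANDSHAKE and epoch == 0
            and datagram[13] == CLIENT_HELLO)


def next_accept_sees(close_notify_late):
    """Model what the server's next accept/handshake step observes."""
    connection_queue = deque([record(APPDATA, 1, 1, b"<encrypted request>")])
    listening_queue = deque()
    close_notify = record(ALERT, 1, 2, b"<encrypted alert>")

    if not close_notify_late:
        connection_queue.append(close_notify)   # arrives before the close
    connection_queue.popleft()                  # server reads the request
    connection_queue.clear()                    # server closes the connection
    if close_notify_late:
        listening_queue.append(close_notify)    # arrives after the close

    if not listening_queue:
        return "still waiting"
    first = listening_queue.popleft()
    if looks_like_initial_client_hello(first):
        return "handshake starts"
    return "unexpected message"
```

In this model, `next_accept_sees(False)` leaves the server waiting for a genuine new client, while `next_accept_sees(True)` hands the stale epoch-1 alert to the new handshake, which matches the shape of the failure in the server log.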

I think the best way to confirm what's happening on the CI when the test fails would be to insert a proxy in the middle of the connection, but as you noted in the initial PR adding those tests, that's a bit complicated due to the fixed port number in those sample programs.
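
For illustration only, a logging UDP relay could look roughly like the Python sketch below. It is a hypothetical stand-in, not the project's test tooling, and it does not solve the fixed-port problem: it assumes the client can be pointed at the relay's port while the server keeps its own. The names `describe` and `proxy` and all parameters are made up for this sketch.

```python
import select
import socket

RECORD_TYPES = {20: "ChangeCipherSpec", 21: "Alert",
                22: "Handshake", 23: "ApplicationData"}


def describe(datagram):
    """Summarize the first DTLS record header found in a datagram."""
    if len(datagram) < 13:          # shorter than a DTLS 1.2 record header
        return "short datagram"
    epoch = int.from_bytes(datagram[3:5], "big")
    name = RECORD_TYPES.get(datagram[0], "unknown")
    return f"{name} epoch={epoch} len={len(datagram)}"


def proxy(listen_sock, server_addr, max_datagrams, log=print, timeout=2.0):
    """Relay datagrams between one client and the server, logging each one.

    Stops after max_datagrams datagrams or after `timeout` idle seconds.
    Returns the number of datagrams seen.
    """
    upstream = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_addr = None
    seen = 0
    try:
        while seen < max_datagrams:
            ready, _, _ = select.select([listen_sock, upstream], [], [], timeout)
            if not ready:
                break
            for sock in ready:
                data, addr = sock.recvfrom(65535)
                if sock is listen_sock:
                    client_addr = addr              # remember where to send replies
                    upstream.sendto(data, server_addr)
                    log("C->S " + describe(data))
                elif client_addr is not None:
                    listen_sock.sendto(data, client_addr)
                    log("S->C " + describe(data))
                seen += 1
    finally:
        upstream.close()
    return seen
```

A log line such as `C->S Alert epoch=1 ...` appearing after the server has printed "Closing the connection" would support the race hypothesis; seeing it consistently before the close would argue against it.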