MatrixAI / js-quic

QUIC Networking for TypeScript & JavaScript
https://matrixai.github.io/js-quic/
Apache License 2.0

Create `TLS` cert arbitraries for testing #8

Closed tegefaulkes closed 1 year ago

tegefaulkes commented 1 year ago

Specification

We want the ability to generate the TLS cert chain and private key PEMs for testing. We should also create fast-check arbitraries for this.

We need to replicate the KeyPair generation, x509 certificate creation and PEM formatting from the Polykey methods. Since these will be used for testing, the types can be stripped down to primitives and any extra information can be filled with placeholders. Refer to the keys domain in Polykey for how these are created.

Additional context

Tasks

  1. [x] Create simplified method for generating key pairs.
  2. [x] Create simplified method for generating x509 certificates
  3. [x] Create simplified method for generating cert chain and private key PEM formats.
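
As a rough illustration of the first and third tasks, the sketch below generates an Ed25519 key pair and exports it as PEM using Node's built-in `crypto` module. This is an assumption for illustration only, not the Polykey-derived code the issue refers to, and it leaves out x509 certificate creation, which Node's standard library does not provide.

```typescript
import * as crypto from 'node:crypto';

// Generate an Ed25519 key pair as KeyObjects.
const { publicKey, privateKey } = crypto.generateKeyPairSync('ed25519');

// Export as PEM: PKCS#8 for the private key, SPKI for the public key.
const privateKeyPem = privateKey
  .export({ type: 'pkcs8', format: 'pem' })
  .toString();
const publicKeyPem = publicKey
  .export({ type: 'spki', format: 'pem' })
  .toString();
```

The resulting strings are plain primitives, so they can be passed directly into a test config or wrapped in a fast-check arbitrary.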
CMCDragonkai commented 1 year ago

Important things I've already explained.

  1. https://github.com/MatrixAI/js-quic/issues/2#issuecomment-1506230090
  2. https://github.com/MatrixAI/js-quic/issues/2#issuecomment-1506230852
  3. https://github.com/MatrixAI/js-quic/issues/2#issuecomment-1506231602
  4. https://github.com/MatrixAI/js-quic/issues/2#issuecomment-1506232459 maybe you need the webcrypto polyfill.
CMCDragonkai commented 1 year ago

It is critical to test an Ed25519-signed cert chain, as this is what is being used in PK.

tegefaulkes commented 1 year ago

Most of this is done now. I've lifted a bunch of functions from Polykey and stripped them down a little. Mostly removing the polykey extensions since they're not really used here.

The QUICClient tests now use fast-check to generate the TLS config, but they're failing since the public keys are not properly derived.

The last step is to derive the public key from the private key. In Polykey it's using sodium-native to do this but we want to avoid using that here. One lead is to use the noble-ed25519 implementation to derive the key.
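
One possible way to derive the public key without sodium-native or noble-ed25519 is Node's built-in `crypto` module. This is only a sketch under that assumption, not what Polykey does: it wraps the 32-byte seed in the fixed PKCS#8 header for Ed25519, then asks Node for the matching public key. The helper name `derivePublicKey` is hypothetical.

```typescript
import * as crypto from 'node:crypto';

// Fixed PKCS#8 DER header for an Ed25519 private key; the 32-byte seed follows it.
const ED25519_PKCS8_PREFIX = Buffer.from(
  '302e020100300506032b657004220420',
  'hex',
);

// Hypothetical helper: derive the raw 32-byte public key from a 32-byte seed.
function derivePublicKey(seed: Buffer): Buffer {
  if (seed.length !== 32) {
    throw new RangeError('Ed25519 seed must be 32 bytes');
  }
  const privateKey = crypto.createPrivateKey({
    key: Buffer.concat([ED25519_PKCS8_PREFIX, seed]),
    format: 'der',
    type: 'pkcs8',
  });
  // A public KeyObject can be created directly from a private KeyObject.
  const spki = crypto
    .createPublicKey(privateKey)
    .export({ type: 'spki', format: 'der' });
  // The raw public key is the last 32 bytes of the SPKI structure.
  return spki.subarray(spki.length - 32);
}
```

The derivation can be checked against the first test vector in RFC 8032, section 7.1.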

tegefaulkes commented 1 year ago

I'm looking into using noble-ed25519 to derive the key. At the top level it's doing some simple stuff, but one stage relies heavily on the Point class, which makes up most of the code. If I lift just the parts I need then I'm pretty much taking the whole thing.

So I may as well just import it, but attempting to import it results in an error. The noble-ed25519 module only supports ESM imports, which I don't think we support.

tegefaulkes commented 1 year ago

This is done now. Deriving the public key is done using sodium-native the same way as Polykey. sodium-native has been added as a dev dependency.

CMCDragonkai commented 1 year ago

Great! Did you test cert chains too?

tegefaulkes commented 1 year ago

Yeah, tested with random-length cert chains with a minimum of 1 cert. The connections still time out for me but I think that's a separate issue. I've confirmed that the config is generated without error.

tegefaulkes commented 1 year ago

Since there is a problem with the generated certs I'm going to re-open this issue.

What I know so far is that

  1. The QUICClient tests are passing when loading the cert from a file. This file is a simple cert with an RSA key.
  2. The tests are passing when loading the certs from memory using the same certs as 1. So we can conclude that the config changes are working.
  3. The tests fail with timeout when using the test-generated certs. These certs are a stripped down version of the Polykey certs.

So the problem could be

  1. The certs are generated badly
  2. Some of the stripped out information may be important.
  3. The generated certs are using an ED25519 key.

To verify the problem we can take two approaches to start.

  1. Verify the generated cert using the step program to inspect the contents of the certs.
  2. Generate a cert using an Ed25519 (OKP) key and test using that.

Beyond that I need to narrow down why the tests are timing out and try to extract more useful errors in this case.

tegefaulkes commented 1 year ago

I did some more digging just to work out how to debug failures like this. I have two failure examples.

  1. Using a cert that we know works with the RSA key. If I set verify_peer to true it fails with TlsFail, likely due to it being self-signed. In this case I think the local_error is 304.
  2. Using the ed25519 signed cert we end up with TlsFail with an error code of 296.

Looking over the docs I can't find a reference for these codes anywhere, but looking at the OpenSSL-defined codes I think they correspond to ...

  1. SSL_R_CIPHER_MISMATCH_ON_EARLY_DATA 304
  2. SSL_R_DUPLICATE_SIGNATURE_ALGORITHM 296

I'm not sure that these are the right codes. They don't really make sense in this context.

CMCDragonkai commented 1 year ago

I've verified that Ed25519 certs don't currently work. RSA and ECDSA certs work.

You can confirm with:

step certificate create localhost localhostrsa.crt localhostrsa.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty RSA

step certificate create localhost localhostec.crt localhostec.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty EC

step certificate create localhost localhosted.crt localhosted.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty OKP

Only the Ed25519 cert fails. In particular the QUICClient is timing out whereas the other certs connect fine. The reason there's a timeout is actually because there's an error sent back to the client that we are not handling. So we should fix that and make sure that the error is not just a timeout error, but a more specific handshake failure error.

I checked on wireshark why it's failing.

Normally we send an initial packet, receive a retry packet, resend initial packet, then receive handshake.

What's happening instead is that we send the initial packet, receive a retry packet, resend the initial packet, and now the server responds with an initial packet containing a CONNECTION_CLOSE frame with the error being CRYPTO_ERROR.

The TLS Alert Description is "Handshake Failure".

I suspect the boring library does not yet support Ed25519 certificates.

image

We also updated to the latest boring and quiche libraries, and this is the case too.

CMCDragonkai commented 1 year ago

@tegefaulkes yes 296 is CRYPTO_ERROR as you can see above. If you use wireshark and log out the keys with the logKeys option, and configure wireshark as per #1, then you see the same thing.
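
To make the mapping concrete: QUIC (RFC 9001, section 4.8) reserves the error code range 0x0100 to 0x01ff for CRYPTO_ERROR, with the TLS alert value added to 0x0100. The sketch below decodes such a code; the function name and the partial alert table are illustrative assumptions, not part of js-quic or quiche.

```typescript
// A few TLS alert descriptions (RFC 8446, section 6); not exhaustive.
const TLS_ALERTS: Record<number, string> = {
  40: 'handshake_failure',
  42: 'bad_certificate',
  43: 'unsupported_certificate',
  45: 'certificate_expired',
  46: 'certificate_unknown',
  48: 'unknown_ca',
};

// Hypothetical helper: returns the TLS alert name for a QUIC CRYPTO_ERROR
// code, or undefined if the code is outside the CRYPTO_ERROR range.
function describeQuicCryptoError(code: number): string | undefined {
  if (code < 0x0100 || code > 0x01ff) return undefined;
  const alert = code - 0x0100;
  const name = TLS_ALERTS[alert];
  return name !== undefined ? name : `alert(${alert})`;
}

console.log(describeQuicCryptoError(296)); // handshake_failure
console.log(describeQuicCryptoError(304)); // unknown_ca
```

Read this way, 296 matches the "Handshake Failure" alert seen in the capture, and 304 would be consistent with a self-signed cert being rejected when verify_peer is on.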

CMCDragonkai commented 1 year ago

Upstream issue: https://github.com/cloudflare/boring/issues/113.

Cloudflare's boring library itself is just a binding around Google's boringssl library: https://github.com/google/boringssl

It's possible we may just need to configure or update the Rust boring package somehow to include or enable ed25519 support in boringssl.

CMCDragonkai commented 1 year ago

There is a bug in the failure condition of TLS verification. When verifyPeer is switched on for the client or the server, the client connection fails and the server connection keeps attempting to send UDP packets back, possibly handshake frames. This has something to do with the way we are handling TLS failures, and the client side not explicitly sending a shutdown frame to the QUICServer.

CMCDragonkai commented 1 year ago

We need to have tests that test with verifyPeer(true).

This just means making use of the CA options in the config.

We can generate local certificates with a certificate authority. The easiest way is to put the local certificate used by the remote peer as a certificate authority.

So here are some possible tests:

  1. If the peer cert is not in the CA, then the connection should fail gracefully.
  2. If the peer cert is in the CA, then the connection should succeed

Do both for client, server, and client & server.
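
The combinations above can be enumerated as a small table of cases. This is only a sketch of the test matrix; the type and field names are hypothetical and do not come from the js-quic test suite.

```typescript
// Which side has verifyPeer switched on.
type VerifySide = 'client' | 'server' | 'both';

interface VerifyPeerCase {
  verifySide: VerifySide;
  peerCertInCa: boolean; // is the peer's cert included in the CA option?
  expectSuccess: boolean; // otherwise the connection should fail gracefully
}

function buildVerifyPeerCases(): VerifyPeerCase[] {
  const sides: VerifySide[] = ['client', 'server', 'both'];
  const cases: VerifyPeerCase[] = [];
  for (const verifySide of sides) {
    for (const peerCertInCa of [true, false]) {
      // The connection should only succeed when the verifying side trusts the peer cert.
      cases.push({ verifySide, peerCertInCa, expectSuccess: peerCertInCa });
    }
  }
  return cases;
}
```

Each entry would then drive one client/server connection test, which gives six cases in total.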

CMCDragonkai commented 1 year ago

These tests can be done for ECDSA certs or RSA certs... for now let's just use step-cli to generate the certs and keys and save the files into the test fixtures.

CMCDragonkai commented 1 year ago

@tegefaulkes do note I forgot to add quiche.verifyPeer(config.verifyPeer); on my branch so make sure you have that in the buildQuicheConfig function.

CMCDragonkai commented 1 year ago

Also I'm not entirely sure if it matters whether the server or the client produces the keylog. I tried on both they seem to have the same effect on wireshark.

CMCDragonkai commented 1 year ago

@tegefaulkes please set clientDefault.verifyPeer to true and serverDefault.verifyPeer to false. Those are quiche's defaults and we should preserve them in our code.

CMCDragonkai commented 1 year ago

Tests may have to be updated correspondingly though, depending on whether you are testing TLS or not.

tegefaulkes commented 1 year ago

We have a response here https://github.com/cloudflare/quiche/issues/1482 for the failing ED25519 certs.

So we need to enable the ed25519 algo using the boring SslContextBuilder's set_sigalgs_list.

The method takes a string of the algorithms delimited by ":". We just need to set ed25519, but we want to support all of the algorithms. Annoyingly the options are not very well documented so I had to dig into the source.

This is what I found

// private key types
RSA
RSA-PSS
PSS
ECDSA

// hash types
SHA1
SHA256
SHA384
SHA512

// Combined like
RSA+SHA256
ECDSA+SHA256
ed25519

// TLS 1.3-style names
rsa_pkcs1_md5_sha1
rsa_pkcs1_sha1
rsa_pkcs1_sha256
rsa_pkcs1_sha384
rsa_pkcs1_sha512
ecdsa_sha1
ecdsa_secp256r1_sha256
ecdsa_secp384r1_sha384
ecdsa_secp521r1_sha512
rsa_pss_rsae_sha256
rsa_pss_rsae_sha384
rsa_pss_rsae_sha512
ed25519

// All the available options
SSL_SIGN_RSA_PKCS1_SHA1
SSL_SIGN_RSA_PKCS1_SHA256
SSL_SIGN_RSA_PKCS1_SHA384
SSL_SIGN_RSA_PKCS1_SHA512
SSL_SIGN_RSA_PSS_RSAE_SHA256
SSL_SIGN_RSA_PSS_RSAE_SHA384
SSL_SIGN_RSA_PSS_RSAE_SHA512
SSL_SIGN_ECDSA_SHA1
SSL_SIGN_ECDSA_SECP256R1_SHA256
SSL_SIGN_ECDSA_SECP384R1_SHA384
SSL_SIGN_ECDSA_SECP521R1_SHA512
SSL_SIGN_ED25519

Now we just need to select out of this list the ones we want to support.

CMCDragonkai commented 1 year ago

What is the default list if we didn't configure it at all?

Let's just copy the default list, and then just add on top the ed25519.

tegefaulkes commented 1 year ago

Right now I'm taking what Chrome supports and adding ed25519: "ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512"

CMCDragonkai commented 1 year ago

IBM 7.5 only supports these. Ed25519 signatures are always 64 bytes, so there is no further hash size to specify.

image

We can follow this as the default.

tegefaulkes commented 1 year ago

Updated to "ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512"
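
The updated string is every combination of three key types and three hashes, with ed25519 in front and SHA1 left out. A small sketch of how it could be assembled follows; the constant and function names are hypothetical.

```typescript
// Key types and hashes taken from the list quoted in this thread; SHA1 is left out.
const KEY_TYPES = ['RSA', 'ECDSA', 'RSA-PSS'];
const HASHES = ['SHA256', 'SHA384', 'SHA512'];

// Hypothetical helper: build the colon-delimited list for set_sigalgs_list.
function buildSigalgsList(): string {
  // ed25519 has no hash variant, so it is listed on its own.
  const entries: string[] = ['ed25519'];
  for (const keyType of KEY_TYPES) {
    for (const hash of HASHES) {
      entries.push(`${keyType}+${hash}`);
    }
  }
  return entries.join(':');
}
```

Building the list this way makes it harder to drop a combination by accident, as happened with ECDSA+SHA512 in the earlier version of the string.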

tegefaulkes commented 1 year ago

I'm considering this fixed now.

Changes made

  1. Added supportedPrivateKeyAlgos to the config. It sets the supported algos and takes a string listing them. The default is currently "ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512", which allows all available algos except the ones using SHA1 hashing.
  2. Tests now use fast-check to select a tlsConfig randomly from the example fixtures and the generated ed25519 signed cert chains.