Closed tegefaulkes closed 1 year ago
Important things I've already explained.
It is critical to test an ed25519 signed cert chain, as this is what is being used in PK.
Most of this is done now. I've lifted a bunch of functions from Polykey
and stripped them down a little. Mostly removing the polykey
extensions since they're not really used here.
The QUICClient tests are using fast-check to generate the TLS config now, but they're failing since the public keys are not properly derived.
The last step is to derive the public key from the private key. In Polykey it's using sodium-native to do this but we want to avoid using that here. One lead is to use the noble-ed25519 implementation to derive the key.
I'm looking into using noble-ed25519 to derive the key. On the top level it's doing some simple stuff, but one stage relies heavily on the Point class, which makes up most of the code. If I lift just the parts I need then I'm pretty much taking the whole thing, so I may as well just import it. But attempting to import it results in an error. The noble-ed25519 module only supports ESM importing, which we do not seem to support.
This is done now. Deriving the public key is done using sodium-native the same way as Polykey. sodium-native has been added as a dev dependency.
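As an aside, Node's built-in crypto module can also derive an Ed25519 public key from a private key, without sodium-native or noble-ed25519. This is only a sketch of an alternative, not what was done here. The helper name publicKeyFromSeed is made up for illustration, and it assumes the private key is the raw 32-byte seed.

```typescript
import { createPrivateKey, createPublicKey } from 'node:crypto';

// Fixed DER header of a PKCS#8 structure wrapping a 32-byte Ed25519 seed.
const PKCS8_ED25519_PREFIX = Buffer.from(
  '302e020100300506032b657004220420',
  'hex',
);

// Hypothetical helper: derive the raw 32-byte public key from a raw seed.
function publicKeyFromSeed(seed: Uint8Array): Buffer {
  if (seed.length !== 32) {
    throw new RangeError('Ed25519 seed must be exactly 32 bytes');
  }
  // Wrap the seed in PKCS#8 so Node can import it as a private key.
  const privateKey = createPrivateKey({
    key: Buffer.concat([PKCS8_ED25519_PREFIX, seed]),
    format: 'der',
    type: 'pkcs8',
  });
  // Export the matching public key as SPKI DER; the raw key is the last 32 bytes.
  const spki = createPublicKey(privateKey).export({
    format: 'der',
    type: 'spki',
  });
  return spki.subarray(spki.length - 32);
}
```

If the private key is stored in sodium's 64-byte secret key layout instead, the seed would be the first 32 bytes of it.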
Great! Did you test cert chains too?
Yeah, tested with random length cert chains with a min of 1 cert. The connections still time out for me but I think that's a separate issue. I've confirmed that the config is generated without error.
Since there is a problem with the generated certs I'm going to re-open this issue.
What I know so far is that:
- The QUICClient tests are passing when loading the cert from a file. This file is a simple cert with an RSA key.
- The tests fail with the generated Polykey certs. So the problem could be the ED25519 key.
To verify the problem we can take two approaches to start:
- Use the step program to inspect the contents of the certs.
- Generate an ED25519 (OKP) key and test using that.
Beyond that I need to narrow down why the tests are timing out and try to extract more useful errors in this case.
I did some more digging just to work out how to debug failures like this. I have two failure examples:
- A TlsFail, likely due to the cert being self signed. In this case I think the local_error is 304.
- A tlsFail with an error code of 296.
Looking over the docs I can't find a reference for these codes anywhere, but looking at the OpenSSL-defined codes I think they correspond to ...
I'm not sure that these are the right codes. They don't really make sense in this context.
I've verified that Ed25519 certs don't currently work. RSA and ECDSA certs work.
You can confirm with:
step certificate create localhost localhostrsa.crt localhostrsa.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty RSA
step certificate create localhost localhostec.crt localhostec.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty EC
step certificate create localhost localhosted.crt localhosted.key --profile self-signed --subtle --no-password --insecure --force --san 127.0.0.1 --san ::1 --not-after 31536000s --kty OKP
Only the Ed25519 cert fails. In particular the QUICClient is timing out whereas the other certs connect fine. The reason there's a timeout is actually because there's an error sent back to the client that we are not handling. So we should fix that and make sure that the error is not just a timeout error, but a more specific handshake failure error.
I checked on wireshark why it's failing.
Normally we send an initial packet, receive a retry packet, resend the initial packet, then receive a handshake packet.
What's happening instead is that we send an initial packet, receive a retry packet, resend the initial packet, and now the server responds with an initial packet with a CONNECTION_CLOSE frame with the error being CRYPTO_ERROR.
The TLS Alert Description is "Handshake Failure".
I suspect the boring library does not yet support Ed25519 certificates.
We also updated to the latest boring and quiche libraries, and this is the case too.
@tegefaulkes yes 296 is CRYPTO_ERROR as you can see above. If you use wireshark and log out the keys with the logKeys option, and configure wireshark as per #1, then you see the same thing.
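For reference, both codes seen earlier fit the QUIC convention that CRYPTO_ERROR codes occupy 0x0100 to 0x01ff, with the low byte carrying the TLS alert number (RFC 9001, section 4.8). Below is a minimal sketch of that decoding. decodeQuicError and the small alert table are illustrative names only, not part of this library or of quiche.

```typescript
// A few TLS alert descriptions relevant to certificate problems.
// This table is deliberately partial.
const TLS_ALERTS: Record<number, string> = {
  40: 'handshake_failure',
  42: 'bad_certificate',
  43: 'unsupported_certificate',
  45: 'certificate_expired',
  46: 'certificate_unknown',
  48: 'unknown_ca',
};

// Hypothetical helper: turn a QUIC transport error code into readable text.
function decodeQuicError(code: number): string {
  // CRYPTO_ERROR range: 0x0100 + TLS alert number.
  if (code >= 0x0100 && code <= 0x01ff) {
    const alert = code & 0xff;
    const name = TLS_ALERTS[alert] ?? 'unknown';
    return `CRYPTO_ERROR (TLS alert ${alert}: ${name})`;
  }
  return `transport error 0x${code.toString(16)}`;
}

console.log(decodeQuicError(296)); // 296 = 0x0128, alert 40
console.log(decodeQuicError(304)); // 304 = 0x0130, alert 48
```

Read this way, 296 is the "Handshake Failure" alert wireshark shows, and 304 would be unknown_ca, which is consistent with the self signed guess.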
Upstream issue: https://github.com/cloudflare/boring/issues/113.
Cloudflare's boring library itself is just a binding around Google's boringssl library: https://github.com/google/boringssl
It's possible we may just need to update or configure the Rust boring package somehow to include or enable ed25519 support in boringssl.
There is a bug in the failure condition of TLS verification. When verifyPeer is switched on for the client or the server, the client connection fails and the server connection continuously attempts to send UDP packets back, possibly handshake frames. This has something to do with the way we are handling TLS failures, and the client side not explicitly sending a shutdown frame to the QUICServer.
We need to have tests that test with verifyPeer(true).
This just means making use of the CA options in the config.
We can generate local certificates with a certificate authority. The easiest way is to put the local certificate used by the remote peer as a certificate authority.
So here are some possible tests:
Do both for client, server, and client & server.
These tests can be done for ECDSA certs or RSA certs... for now let's just use step-cli to generate the certs and keys and save the files into the test fixtures.
@tegefaulkes do note I forgot to add quiche.verifyPeer(config.verifyPeer); on my branch so make sure you have that in the buildQuicheConfig function.
Also I'm not entirely sure if it matters whether the server or the client produces the keylog. I tried on both they seem to have the same effect on wireshark.
@tegefaulkes please set clientDefault.verifyPeer to be true and serverDefault.verifyPeer to be false. Those are the defaults in quiche and we should preserve them in our code.
Tests may have to be correspondingly updated though, depending on whether you are testing TLS or not.
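A small sketch of how those defaults and per-test overrides could fit together. Only the names clientDefault, serverDefault and verifyPeer come from this thread. The VerifyConfig type and the withDefaults helper are assumptions for illustration, and the real config has more fields than this.

```typescript
// Assumed, stripped-down config shape; the real config has more options.
type VerifyConfig = {
  verifyPeer: boolean;
};

// quiche's own defaults: clients verify the peer, servers do not.
const clientDefault: VerifyConfig = { verifyPeer: true };
const serverDefault: VerifyConfig = { verifyPeer: false };

// Hypothetical helper: overlay explicit options on top of the defaults.
function withDefaults(
  defaults: VerifyConfig,
  overrides: Partial<VerifyConfig> = {},
): VerifyConfig {
  return { ...defaults, ...overrides };
}

// A test that is not about TLS can opt out of verification explicitly.
const plainClient = withDefaults(clientDefault, { verifyPeer: false });
// A test of mutual TLS opts the server in.
const strictServer = withDefaults(serverDefault, { verifyPeer: true });
```

Keeping the opt-out explicit in each test makes it obvious which tests exercise TLS verification and which do not.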
We have a response here https://github.com/cloudflare/quiche/issues/1482 for the failing ED25519 certs.
So we need to enable the ed25519 algo using the boring SslContextBuilder's set_sigalgs_list.
The method takes a string of the algorithms delimited by ":". We just need to set ed25519, but we want to support all of the algorithms. Annoyingly the options are not very well documented so I had to dig into the source.
This is what I found:
// private key types
RSA
RSA-PSS
PSS
ECDSA
// hash types
SHA1
SHA256
SHA384
SHA512
// Combined like
RSA+SHA256
ECDSA+sha256
ed25519
// TLS 1.3-style names
rsa_pkcs1_md5_sha1
rsa_pkcs1_sha1
rsa_pkcs1_sha256
rsa_pkcs1_sha384
rsa_pkcs1_sha512
ecdsa_sha1
ecdsa_secp256r1_sha256
ecdsa_secp384r1_sha384
ecdsa_secp521r1_sha512
rsa_pss_rsae_sha256
rsa_pss_rsae_sha384
rsa_pss_rsae_sha512
ed25519
// All the available options
SSL_SIGN_RSA_PKCS1_SHA1
SSL_SIGN_RSA_PKCS1_SHA256
SSL_SIGN_RSA_PKCS1_SHA384
SSL_SIGN_RSA_PKCS1_SHA512
SSL_SIGN_RSA_PSS_RSAE_SHA256
SSL_SIGN_RSA_PSS_RSAE_SHA384
SSL_SIGN_RSA_PSS_RSAE_SHA512
SSL_SIGN_ECDSA_SHA1
SSL_SIGN_ECDSA_SECP256R1_SHA256
SSL_SIGN_ECDSA_SECP384R1_SHA384
SSL_SIGN_ECDSA_SECP521R1_SHA512
SSL_SIGN_ED25519
Now we just need to select out of this list the ones we want to support.
What is the default list if we didn't configure it at all?
Let's just copy the default list, and then just add ed25519 on top.
Right now I'm taking what Chrome supports and adding ed25519.
"ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512"
IBM 7.5 only supports these, and Ed25519 only has 64 bytes and so has no further hash size specification.
We can follow this as the default.
Updated to "ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512"
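The updated list is just ed25519 followed by every key type paired with every non-SHA1 hash, so it can be built rather than typed out. A minimal sketch, where buildSigalgsList is a made-up helper name and the string format is the ":" delimited one described above:

```typescript
// Key types and hashes that get combined as KEY+HASH.
const KEY_TYPES = ['RSA', 'ECDSA', 'RSA-PSS'];
const HASHES = ['SHA256', 'SHA384', 'SHA512'];

// Hypothetical helper: standalone algorithms first, then every KEY+HASH pair.
function buildSigalgsList(standalone: string[] = ['ed25519']): string {
  const algos: string[] = [...standalone];
  for (const keyType of KEY_TYPES) {
    for (const hash of HASHES) {
      algos.push(`${keyType}+${hash}`);
    }
  }
  // The list is a single string with ":" between entries.
  return algos.join(':');
}

console.log(buildSigalgsList());
```

SHA1 is left out of HASHES on purpose, so no SHA1 based combination can end up in the result.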
I'm considering this fixed now.
Changes made
- Added supportedPrivateKeyAlgos to the config. This sets the supported algos and takes a string listing them. The default is currently "ed25519:RSA+SHA256:RSA+SHA384:RSA+SHA512:ECDSA+SHA256:ECDSA+SHA384:ECDSA+SHA512:RSA-PSS+SHA256:RSA-PSS+SHA384:RSA-PSS+SHA512" which allows all available algos except ones using SHA1 hashing.
- Tests now use fast-check to select a tlsConfig randomly from the example fixtures and the generated ed25519 signed cert chains.
Specification
We want the ability to generate the TLS cert chain and private key PEMs for testing. We should also create fast-check arbitraries for this.
We need to replicate the KeyPair generation, x509 certificate creation and PEM format from the Polykey methods. Since these will be used for testing, the types can be stripped down to primitives and any extra information can be as placeholder as possible. Refer to the keys domain in Polykey for how these are created.
Additional context
Tasks
x509 certificates
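The key pair and PEM part of the specification can be sketched with Node's built-in crypto alone. This is not the Polykey implementation. KeyPairPem, generateKeyPairPem and generateKeyChain are placeholder names, and x509 certificate creation is left out because Node's standard library cannot issue certificates.

```typescript
import { generateKeyPairSync } from 'node:crypto';

// Stripped-down key pair: just the two PEM strings.
type KeyPairPem = {
  publicKey: string;
  privateKey: string;
};

// Generate one Ed25519 key pair, encoded as SPKI and PKCS#8 PEM.
function generateKeyPairPem(): KeyPairPem {
  return generateKeyPairSync('ed25519', {
    publicKeyEncoding: { type: 'spki', format: 'pem' },
    privateKeyEncoding: { type: 'pkcs8', format: 'pem' },
  });
}

// Generate the key pairs for a chain of the given length (minimum 1),
// one per certificate that would later be issued.
function generateKeyChain(length: number): KeyPairPem[] {
  if (!Number.isInteger(length) || length < 1) {
    throw new RangeError('chain length must be an integer >= 1');
  }
  return Array.from({ length }, () => generateKeyPairPem());
}
```

A fast-check arbitrary could then pick the chain length and map it through generateKeyChain, with certificate creation layered on top.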