internetstandards / Internet.nl

Internet standards compliance test suite
https://internet.nl
172 stars 35 forks source link

[WIP] Fix #714 - Switch to (mainly) sslyze for TLS testing #1218

Open mxsasha opened 9 months ago

mxsasha commented 9 months ago

Known changes/fixes

Specifics:

Reduced features

TODO

Fixes to sslyze:

Later:

Content updates:

Potential issues found in comparisons that are not yet known differences:

mxsasha commented 7 months ago

Summary:

Parallel mail server tests

Not immediately related but associated, I want to try parallelizing the mail server tests on individual different mail servers. The upside is faster results, especially useful now that the test per server is a bit slower. The downside is potential increased blocking from services that do not check the number of connections to an IP per mail server, but for the entire pool. But I can't say whether anyone does that.

SNI sending on mail server connections and DANE

Our current code only sends SNI on mailservers when DANE is present. Is that desired and intentional behavior? The new code always sends SNI, but it's not hard to change.

Public key DH

Old code / current new code

I am puzzled by some details of this. This concerns the validation of the certificate sent by the server, referring NCSC guidelines B3-3, B5-1, which in turn refer table 8 and 9. In my reading, the old code accepted RSA>=2048, DSA>=2048, DH>=2048 and SECP521R1/SECP384R1/SECP256R1/X25519/X448 >= 224, plus SECP224R1 as phase out.

Two questions:

Hash function for key exchange

We check a hash function for key exchange, for "SHA2 support for signatures", referring table 5. I can not figure out the context of this check - this is separate from the hash in the cipher, right?

baknu commented 7 months ago
* Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

* I can not find a reference in the TLS guidelines for acceptance of DH keys >= 2048. Am I overlooking it or should we not accept this?

For DHE we also check if finite field groups (RFC 7919) are used. The test explanation might also give some more insight: https://en.internet.nl/site/example.nl/2591039/#control-panel-14. This mentions guideline B5-1 and table 9 for ECDHE, and guideline B6-1 and table 10 for DHE. It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

mxsasha commented 7 months ago
* Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

* I can not find a reference in the TLS guidelines for acceptance of DH keys >= 2048. Am I overlooking it or should we not accept this?

For DHE we also check if finite field groups (RFC 7919) are used.

Currently we do not do this for the certificate, only for the key exchange. For public keys, the current implementation accepts any DH key >= 2048 bits.

This mentions guideline B5-1 and table 9 for ECDHE, and guideline B6-1 and table 10 for DHE.

My reading of this is also that this applies to any use of ECDHE/DHE, so the test should check DH public keys for specific finite field groups.

It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

I have not been able to locate code that currently does this, or what the requirements are that we have of the parameters.

baknu commented 7 months ago

@gthess Can you shed your light on this? Thanks!

gthess commented 7 months ago

The following is input based on my memory and a quick look over the (old) code. Both can be disputed :)

SNI sending on mail server connections and DANE

I believe the condition on the SNI is there because without DANE there is nothing to verify for a mail client, there is no certificate store for email; any certificate is good and is used for encryption. DANE changed that (other than DANE-EE) and requires name verification. So I believe internet.nl tries to behave like a non-DANE capable client when DANE is absent, and like a DANE capable client when DANE is present. I believe you can find corner cases for a testing platform with both behaviors so I don't think it ultimately matters if SNI is sent or not. What may matter is an extra check to see if DANE and non-DANE clients get different certificates (with different properties). But that is way out of scope for this question.

Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

For these curves yes. But I believe it has to do with the following text from the explanation "We check if the bit-length of the used elliptic curves is a least 224 bits. Currently we are not able to check the elliptic curve name." The size check may be a remnant from a previous version of the guidelines that I can no longer find; I don't see a size check in the current version. So the curve names are there for comparison but I don't think we were able to get them from nassl back then so the 'or' is excercised instead.

Currently we do not do this for the certificate, only for the key exchange. For public keys, the current implementation accepts any DH key >= 2048 bits. My reading of this is also that this applies to any use of ECDHE/DHE, so the test should check DH public keys for specific finite field groups.

The checking of finite fields is happening on the test "key exchange parameters" because that is what DH et al. is used for. Now it seems that DH gets a pass for size just to pass the public key test. Can't remember if that was deliberate or it used to work like that on the previous guideline version and we just left it there because of no harm. Maybe in the "Public key of certificate" test we can now say that DH et al. is already checked on the key exchange test. Otherwise checking the finite field on both tests would be confusing.

It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

I have not been able to locate code that currently does this, or what the requirements are that we have of the parameters.

I understand that this is the RSA check on the public key (table 8).

Hash function for key exchange I can not figure out the context of this check - this is separate from the hash in the cipher, right?

It is different indeed, this is negotiated as a Hello extension to designate the hash algorithm to use before signing for key exchange, when applicable.

mxsasha commented 6 months ago

sslyze will not integrate server cipher order preference checking. However, it's easier than we thought to do it ourselves. We only need to check that good>sufficient>phase out>bad. So once we know accepted ciphers and their goodness, we can make a connection with client preference for e.g. all sufficient ciphers, followed by all good ciphers. If the connection ends up using a good cipher, we are sure cipher order preference is enforced and at least one good is preferred over all sufficient. The same can be done for any tuple of cipher sets. Therefore, we can determine enough with just 3 connection attempts, much less than for a full cipher order.

Part of the needed API is visible in https://github.com/nabla-c0d3/sslyze/blob/release/sslyze/plugins/openssl_cipher_suites/_test_cipher_suite.py#L35 - so this shouldn't be too hard/complex to write ourselves.

gthess commented 6 months ago

Indeed, this is what the old code was doing, although it was weirdly mixed with the connection test to avoid extra connections; mostly made sense for the mail test.

mxsasha commented 6 months ago

SNI sending on mail server connections and DANE

I believe the condition on the SNI is there because without DANE there is nothing to verify for a mail client, there is no certificate store for email; any certificate is good and is used for encryption. DANE changed that (other than DANE-EE) and requires name verification. So I believe internet.nl tries to behave like a non-DANE capable client when DANE is absent, and like a DANE capable client when DANE is present. I believe you can find corner cases for a testing platform with both behaviors so I don't think it ultimately matters if SNI is sent or not. What may matter is an extra check to see if DANE and non-DANE clients get different certificates (with different properties). But that is way out of scope for this question.

It seems at least Exim and Postfix do not send SNI when DANE is absent (unless specifically configured to do so), so I think we should follow and only send SNI with present DANE. That means our testing stays close to real scenarios, and the existing code.

mxsasha commented 6 months ago

Suspected bug in existing implementation: cbc.badssl.com has good cipher order in production, but my implementation says it's bad for preferring TLS_RSA_WITH_AES_256_CBC_SHA256 (phase out) over TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (sufficient) - tested on TLS 1.2. SSL labs and testssl confirm my new order detection.

mxsasha commented 6 months ago

There is some complication regarding cipher orders and TLS versions. We currently test three things entirely separate from each other: TLS versions, ciphers enabled, and cipher order. So a server with "TLS_AES_256_GCM_SHA384:TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA" enforced order with TLS 1.3 and 1.2, will fail ciphers enabled, but succeed TLS versions and cipher order.

Also, we have decided that we will test cipher order using only the highest TLS version supported, ignoring TLS 1.3.

This creates a complexity for 6 cipher suites, which are listed under sufficient in NCSC appendix C, with the footnote:

These algorithm selections, combined with TLS 1.3 are Good. However, the (old) ciper suite notation used here will frequently result in the use of at most TLS 1.2 in software, which is Sufficient.

(note that the affected ciphers are also listed in the good section in their TLS 1.3 notation)

In the ciphers enabled code, I have currently counted those as "good", e.g. TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, because that test looks only at the ciphers, and does not care about TLS versions.

Question 1: is this the desired behaviour? This means TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 TLS 1.2 is good on ciphers enabled, sufficient on TLS version. Alternative: score sufficient on both, which means only TLS 1.3 specific ciphers can be rated good in ciphers enabled.

If my current implementation is desired behaviour, it creates a complication in the cipher order. If we see TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as good, it means it should be preferred over e.g. TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, a "true" sufficient cipher. This is my current new implementation, and it fails on e.g. forumstandaardisatie MX (on TLS 1.2), as that prefers the CBC over the GCM cipher: the server prefers a "true" sufficient cipher over a "sufficient-kinda-but-only-because-it's-TLS-1.2" cipher. Current production succeeds, i.e. allows this.

Question 2: should a cipher order of TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 > TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 on TLS 1.2 lead to a failed cipher order test?

Pro: test should fail. In isolation, looking only at the cipher suite parameters, a sufficient (CBC) is preferred over a good (GCM). The fact that the TLS version would downgrade a real client connection to sufficient rating on either cipher, is out of scope as the test only concerns cipher order.

Con: test should succeed, as a client connecting on TLS 1.2 would receive a sufficient rating with either of these ciphers. Side effect: we would never compare good ciphers in ordering. In this interpretation, good ciphers are exclusive to TLS 1.3, and on TLS 1.2 any cipher can be sufficient at best. Suspected current production implementation.

The fundamental problem is: we separate ciphers enabled, cipher order and TLS version into 3 isolated and independent ratings. But a connection from a client has one rating in the end, which involves multiple of these factors. Then again, we could make the same argument for RSA key size, though isolating that is much more unambiguous.

My leaning: we should rate TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as good in the ciphers enabled test, as that test should be entirely isolated from the TLS version. We should allow (i.e. con section) the cipher order mentioned above as for real world use, either cipher provides an equal rating.

Note: this is unrelated to the suspected bug I identified in the previous comment: that concerned phase out vs sufficient ciphers. The ambiguity is only in 6 ciphers that NCSC lists as sufficient in the appendix but on their own merits are good.

gthess commented 6 months ago

If these are upgraded to good when combined with TLS1.3 and the cipher order test does not run on TLS1.3 (because everything is supposed to be good there), I would say treat them as sufficient for the cipher order test.

bwbroersma commented 3 months ago

It seems the code has different time-outs than the current code. After @nlitsme forced me with :beer: to add httpdelay= to b6a.nl, setting it to 10000 (10s) triggered expected 'not reachable' in internet.nl (test), while the current sslyze rewrite just finished without a not reachable.

bwbroersma commented 3 months ago

It seems the current PR handles tls_mail_smtp_starttls task different in terms of finishing and timeouts. It correctly finishes, but reports the TLS test failed as a timeout, while the current code continues in the back-end (while erroring the front-end) and can then provide a result with the /results url-hack. See https://github.com/internetstandards/Internet.nl/issues/1425#issuecomment-2143818855

~BTW I tried to run the sslyze branch, but got an error:~

worker-1     | ModuleNotFoundError: No module named 'sslyze'

~probably because the development is done on a special internetnl sslyze fork? Would be nice to really test both versions (on the same IP's).~ Update 2024-08-09: I probably did something wrong the last time, after a new git checkout sslyze, fixing the vendor directory, git submodule update --init, I did a make down, build, up and it all worked.

bwbroersma commented 1 month ago

@mxsasha Is it known that the current state of the implementation does not correctly detect ffdhe4096? On my servers I have:

# curl https://ssl-config.mozilla.org/ffdhe4096.txt | sudo tee /etc/ssl/dhparams/ffdhe4096.pem
ssl_dhparam /etc/ssl/dhparams/ffdhe4096.pem;

and

$ sha256sum /etc/ssl/dhparams/ffdhe4096.pem 
64852d6890ff9e62eecd1ee89c72af9af244dfef5b853bcedea3dfd7aade22b3  /etc/ssl/dhparams/ffdhe4096.pem

But in the current version of the sslyze version I see:

Key exchange parameters

Verdict: Your web server supports insufficiently secure parameters for Diffie-Hellman key exchange.

Technical details:
Web server IP address     Affected parameters     Status
...           DH-4096         insufficient

The current/old implementation finds this is okay, testssl.sh states:

TLSv1.2 (server order)
...
 x9e     TLS_DHE_RSA_WITH_AES_128_GCM_SHA256               DH 4096    AESGCM      128      DHE-RSA-AES128-GCM-SHA256          
 x9f     TLS_DHE_RSA_WITH_AES_256_GCM_SHA384               DH 4096    AESGCM      256      DHE-RSA-AES256-GCM-SHA384
...
 DH group offered:            ffdhe4096
mxsasha commented 1 month ago

@mxsasha Is it known that the current state of the implementation does not correctly detect ffdhe4096? On my servers I have:

Yes, this showed up in comparisons. Fixed but not deployed yet.

mxsasha commented 1 month ago

There is an issue with some certificate chains, which appears in ipcei-hydrogen.eu. The trust validation fails on CandidatesExhausted(Other("authorityKeyIdentifier must not contain authorityCertIssuer")), an error that originates in cryptography here.

Browsers trust it, and so does ssllabs and our old test. Potentially relevant is that the certificate chain contains the root and is out of order, and the root is SHA1:

openssl s_client ipcei-hydrogen.eu:443
Connecting to 2a04:9a00:1002:400d:f816:3eff:fe83:9a2f
CONNECTED(00000006)
depth=2 C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
verify return:1
depth=1 C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
verify return:1
depth=0 C=NL, ST=Zuid-Holland, L='s-Gravenhage, O=Rijksdienst voor Ondernemend Nederland, CN=ipcei-hydrogen.eu
verify return:1
---
Certificate chain
 0 s:C=NL, ST=Zuid-Holland, L='s-Gravenhage, O=Rijksdienst voor Ondernemend Nederland, CN=ipcei-hydrogen.eu
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
   a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
   v:NotBefore: Jul  8 06:09:03 2024 GMT; NotAfter: Jul  8 06:03:00 2025 GMT
 1 s:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA1
   v:NotBefore: Nov 24 18:27:00 2006 GMT; NotAfter: Nov 24 18:23:33 2031 GMT
 2 s:C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
   v:NotBefore: May 25 19:47:39 2022 GMT; NotAfter: May 24 19:47:38 2027 GMT
---
mxsasha commented 1 month ago

Following up on the last comment: this is unrelated to ordering, including the root certificate, or SHA1. Best guess so far: a bug in the detection of the authorityCertIssuer extension in the cryptography python package.

The issue was introduced in this sslyze commit, included from 6.0.0 onwards, which switches to using py-cryptography's certificate validation.

I made a gist that isolates the issue. This is closely based on, but does not call, the sslyze code for this. My test case uses the leaf certificate and intermediate certificate only - the superfluous and out of order root certificate is removed. This will trigger the "authorityKeyIdentifier must not contain authorityCertIssuer" error, referring an extension defined in RFC5280 section 4.2.1.1.

That error originates in cryptography, based on the CABF baseline which states authorityCertIssuer MUST NOT be present. However, as far as I can find, neither of these certificates contains authorityCertIssuer. Both using local openssl and report-uri's PEM decoder do not see it. It also is unambiguously not permitted by CABF baseline, so it would be very strange for a certificate to suddenly show up that has it.

If the certificate indeed doesn't contain authorityCertIssuer, then the only explanation is py-cryptography detecting that extension in error. Going through the code did not reveal anything obvious - it has to be quite rare or it would have been noticed already. Or, I am missing something about x509.

Update: this was addressed in https://github.com/pyca/cryptography/issues/11461