[WIP] Fix #714 - Switch to (mainly) sslyze for TLS testing

mxsasha commented 9 months ago

Known changes/fixes

Server cipher order preference detects issues that were missed by the old code.
Sometimes some ciphers are not detected by the old code.

Specifics:

cbc.badssl.com web has good cipher order in the old code, but new implementation says it's bad for preferring TLS_RSA_WITH_AES_256_CBC_SHA256 (phase out) over TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (sufficient) - SSL labs and testssl confirm my new order detection.
~~sidn.nl mx has TLS_ECDHE_RSA_WITH_CAMELLIA_256_CBC_SHA* which is insufficient and not detected by the old code~~
sidn.nl mx prefers TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 over TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 which sufficient>good, not detected in old code.
github.com offers both ECDSA and RSA certificate and this causes two glitches in RSA authenticated ciphers that are not detected by testssl and ssllabs, nor the old code which counts it as good. First, it has TLS_RSA_WITH_AES_256_GCM_SHA384 amongst others which is phase out. Second, it prefers TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256 over TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 which is bad as AES-128-CBC is sufficient only in NCSC guidelines.
ad.nl has TLS_AES_128_CCM_8_SHA256, old test does not detect this
old test appears to sometimes miss OCSP stapling, but sees it on retry

Reduced features

ECDHE-ECDSA-CHACHA20-POLY1305-OLD and ECDHE-RSA-CHACHA20-POLY1305-OLD are not detected by the new test, along with possible other -OLD ciphers. Seen in milieucentraal.nl as example. My openssl does not support that algo at all, sslyze does not detect it, ssllabs and testssl do. If we strongly care about this, we need to get them added to sslyze.
We no longer distinguish "your server has no cipher order preference" from "your server has a cipher order preference, but it violates the requirements". Intentional choice, as it has little value and requires an extra connection.

TODO

[x] ECDH-X25519 is falsely counted as insufficient
[x] Make sure we catch sslyze error states
[x] Get the logging framework to work and make useful logs
[x] We're not using the internetnl trust store
[x] Trust isn't actually checked
[x] Some mix of web/mail scorings in cert_checks() and check_pubkey()
[x] check_pubkey() might need some correction in DH checking
[x] Mail server cache currently disabled
[x] Only one mailserver tested right now
[x] check_mail_tls() used to depend SNI on whether DANE was present??? Look into that.
[x] Check whether we're handling connection issues correctly
[x] OCSP is probably not using the right trust store
[x] Check if DANE validation actually works
[x] Re-enable OCSP check - might be in certificate info scan?
[x] Re-enable our trust store
[x] Key exchange hash function (kex_hash_func) doesn't seem supported by sslyze or has a different name or I'm overlooking it
[x] Hostname matching may be wrong for mail - and check score penalty this is partually handled a layer up
[x] Is SoftTimeLimitExceeded ever raised?
[x] Parallelise different mail server tests?
[ ] Review/refactoring
[x] Duplicate ciphers in cipher list
[x] Duplicate kex parameters
[x] On cbc.badssl.com, Cipher order fails with first pair "None"
[x] TLS versions not ordered
[x] Look into SSL_OP_PRIORITIZE_CHACHA (#461)
[ ] Clean out print statements
[x] Add CAMELLIA/SEED which are missing from appendix C
[x] Clean up location of max renegotiation attempts in sslyze patch
[x] Wait for 220 greeting to be finished in starttls

Fixes to sslyze:

[x] Set proper SMTP HELO hostname PR
[x] Return more detail on HELO rejected PR
[x] Set custom user agent on HTTP requests
~~Server cipher order is unsupported by sslyze~~

Later:

[ ] Document any known differences
[x] Validate against NCSC guidelines, especially the footnotes
[x] Add normative references
[ ] Remove OpenSSL names from cipher names
[x] Fix up the tests
[ ] Run comparison
[ ] Update to latest cryptography for relaxed AKI checks for correct certificate trust
[ ] Try to migrate to non-custom sslyze and nassl, or at least make sure patched versions are properly included.

Content updates:

[ ] We no longer use the difference between "cipher order wrong" and "cipher order not enforced", the only verdict labels are now bad/good/na
[ ] We use IANA cipher names now.
[ ] Update labels for cipher order tech table (IP address, Preferred by server, Expected preference). This was not noticeable before due to a rendering bug.

Potential issues found in comparisons that are not yet known differences:

~~milieucentraal.nl old test detects ECDHE-ECDSA-CHACHA20-POLY1305-OLD and ECDHE-RSA-CHACHA20-POLY1305-OLD, new test does not~~
~~ricomoda.com old test detects FFDHE-2048, new test does not, minocw.nl old test detects DH-4096, new test does not - old test confirmed correct by testssl and ssllabs~~
~~airbnb.nl old test detects DH-2048, new test does not - testssl unclear, ssllabs agrees with old test~~
amstelveen.nl new test claims TLS_DHE_RSA_WITH_AES_128_GCM_SHA256 preferred over TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256, but ssllabs and testssl do not agree with new test. Logs show new test asks TLS_DHE_RSA_WITH_AES_256_GCM_SHA384 / TLS_DHE_RSA_WITH_AES_128_GCM_SHA256/ TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305_SHA256 and then TLS_DHE_RSA_WITH_AES_128_GCM_SHA256 is selected, which seems like correct detection. Openssl DHE-RSA-AES256-GCM-SHA384:DHE-RSA-AES128-GCM-SHA256:ECDHE-RSA-CHACHA20-POLY1305 on TLS 1.2 picks the CHACHA20, so this looks like a bug in the new code.
~~www.adviescollegeveiligheidgroningen.nl fails key exchange on DH-4096 on new test, not on old test~~
~~document-viewer.dso.kadaster.nl and www.mmdc.nl is good on client-initiated renegotiation in new test, bad in old test~~ moved to #1483
~~www.vwsdia-archief.nl key exchange is DH-2048 / insufficient in new test, FFDHE-2048 / phase out in old test~~
~~ipcei-hydrogen.eu and stelselcatalogus.omgevingswet.overheid.nl has bad trust chain in new test - browser is happy with it, so probable new test bug~~
~~www.nationaalgeoregister.nl and www.sportakkoord.nl key exchange is DH-4096 / insufficient in new test, good in old test~~

mxsasha commented 7 months ago

Summary:

Anything not noted in here should work.
There are a number of unclear things after evaluating the old code and the TLS guidelines, noted below.
I've tried to extract lists of ciphers etc from the code in the new branch to make it easier to understand and update our policies.
Server cipher order is entirely unsupported and currently always OK.
Mail server tests through sslyze are somewhat slower.
Cipher list currently outputs full IANA names too to aid in debugging.
I've not run a full check of the code against the TLS guidelines, so there are probably unidentified differences with the new code. It is also likely that the old code has bugs fixed by this PR, and that the new code has new, different bugs.
Something is off with certificate hostname matching for mail.
Looking for input on the points below.

Parallel mail server tests

Not immediately related but associated, I want to try parallelizing the mail server tests on individual different mail servers. The upside is faster results, especially useful now that the test per server is a bit slower. The downside is potential increased blocking from services that do not check the number of connections to an IP per mail server, but for the entire pool. But I can't say whether anyone does that.

SNI sending on mail server connections and DANE

Our current code only sends SNI on mailservers when DANE is present. Is that desired and intentional behavior? The new code always sends SNI, but it's not hard to change.

Public key DH

Old code / current new code

I am puzzled by some details of this. This concerns the validation of the certificate sent by the server, referring NCSC guidelines B3-3, B5-1, which in turn refer table 8 and 9. In my reading, the old code accepted RSA>=2048, DSA>=2048, DH>=2048 and SECP521R1/SECP384R1/SECP256R1/X25519/X448 >= 224, plus SECP224R1 as phase out.

Two questions:

Isn't the key size check for named curves superfluous because all these curves have a fixed key size?
I can not find a reference in the TLS guidelines for acceptance of DH keys >= 2048. Am I overlooking it or should we not accept this?

Hash function for key exchange

We check a hash function for key exchange, for "SHA2 support for signatures", referring table 5. I can not figure out the context of this check - this is separate from the hash in the cipher, right?

baknu commented 7 months ago

* Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

* I can not find a reference in the TLS guidelines for acceptance of DH keys >= 2048. Am I overlooking it or should we not accept this?

For DHE we also check if finite field groups (RFC 7919) are used. The test explanation might also give some more insight: https://en.internet.nl/site/example.nl/2591039/#control-panel-14. This mentions guideline B5-1 and table 9 for ECDHE, and guideline B6-1 and table 10 for DHE. It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

mxsasha commented 7 months ago

* Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

* I can not find a reference in the TLS guidelines for acceptance of DH keys >= 2048. Am I overlooking it or should we not accept this?

For DHE we also check if finite field groups (RFC 7919) are used.

Currently we do not do this for the certificate, only for the key exchange. For public keys, the current implementation accepts any DH key >= 2048 bits.

This mentions guideline B5-1 and table 9 for ECDHE, and guideline B6-1 and table 10 for DHE.

My reading of this is also that this applies to any use of ECDHE/DHE, so the test should check DH public keys for specific finite field groups.

It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

I have not been able to locate code that currently does this, or what the requirements are that we have of the parameters.

baknu commented 7 months ago

@gthess Can you shed your light on this? Thanks!

gthess commented 7 months ago

The following is input based on my memory and a quick look over the (old) code. Both can be disputed :)

SNI sending on mail server connections and DANE

I believe the condition on the SNI is there because without DANE there is nothing to verify for a mail client, there is no certificate store for email; any certificate is good and is used for encryption. DANE changed that (other than DANE-EE) and requires name verification. So I believe internet.nl tries to behave like a non-DANE capable client when DANE is absent, and like a DANE capable client when DANE is present. I believe you can find corner cases for a testing platform with both behaviors so I don't think it ultimately matters if SNI is sent or not. What may matter is an extra check to see if DANE and non-DANE clients get different certificates (with different properties). But that is way out of scope for this question.

Isn't the key size check for named curves superfluous because all these curves have a fixed key size?

For these curves yes. But I believe it has to do with the following text from the explanation "We check if the bit-length of the used elliptic curves is a least 224 bits. Currently we are not able to check the elliptic curve name." The size check may be a remnant from a previous version of the guidelines that I can no longer find; I don't see a size check in the current version. So the curve names are there for comparison but I don't think we were able to get them from nassl back then so the 'or' is excercised instead.

Currently we do not do this for the certificate, only for the key exchange. For public keys, the current implementation accepts any DH key >= 2048 bits. My reading of this is also that this applies to any use of ECDHE/DHE, so the test should check DH public keys for specific finite field groups.

The checking of finite fields is happening on the test "key exchange parameters" because that is what DH et al. is used for. Now it seems that DH gets a pass for size just to pass the public key test. Can't remember if that was deliberate or it used to work like that on the previous guideline version and we just left it there because of no harm. Maybe in the "Public key of certificate" test we can now say that DH et al. is already checked on the key exchange test. Otherwise checking the finite field on both tests would be confusing.

It also states that the RSA public parameters are tested in the subtest 'Public key of certificate'.

I have not been able to locate code that currently does this, or what the requirements are that we have of the parameters.

I understand that this is the RSA check on the public key (table 8).

Hash function for key exchange I can not figure out the context of this check - this is separate from the hash in the cipher, right?

It is different indeed, this is negotiated as a Hello extension to designate the hash algorithm to use before signing for key exchange, when applicable.

mxsasha commented 6 months ago

sslyze will not integrate server cipher order preference checking. However, it's easier than we thought to do it ourselves. We only need to check that good>sufficient>phase out>bad. So once we know accepted ciphers and their goodness, we can make a connection with client preference for e.g. all sufficient ciphers, followed by all good ciphers. If the connection ends up using a good cipher, we are sure cipher order preference is enforced and at least one good is preferred over all sufficient. The same can be done for any tuple of cipher sets. Therefore, we can determine enough with just 3 connection attempts, much less than for a full cipher order.

Part of the needed API is visible in https://github.com/nabla-c0d3/sslyze/blob/release/sslyze/plugins/openssl_cipher_suites/_test_cipher_suite.py#L35 - so this shouldn't be too hard/complex to write ourselves.

gthess commented 6 months ago

Indeed, this is what the old code was doing, although it was weirdly mixed with the connection test to avoid extra connections; mostly made sense for the mail test.

mxsasha commented 6 months ago

SNI sending on mail server connections and DANE

I believe the condition on the SNI is there because without DANE there is nothing to verify for a mail client, there is no certificate store for email; any certificate is good and is used for encryption. DANE changed that (other than DANE-EE) and requires name verification. So I believe internet.nl tries to behave like a non-DANE capable client when DANE is absent, and like a DANE capable client when DANE is present. I believe you can find corner cases for a testing platform with both behaviors so I don't think it ultimately matters if SNI is sent or not. What may matter is an extra check to see if DANE and non-DANE clients get different certificates (with different properties). But that is way out of scope for this question.

It seems at least Exim and Postfix do not send SNI when DANE is absent (unless specifically configured to do so), so I think we should follow and only send SNI with present DANE. That means our testing stays close to real scenarios, and the existing code.

mxsasha commented 6 months ago

Suspected bug in existing implementation: cbc.badssl.com has good cipher order in production, but my implementation says it's bad for preferring TLS_RSA_WITH_AES_256_CBC_SHA256 (phase out) over TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA (sufficient) - tested on TLS 1.2. SSL labs and testssl confirm my new order detection.

mxsasha commented 6 months ago

There is some complication regarding cipher orders and TLS versions. We currently test three things entirely separate from each other: TLS versions, ciphers enabled, and cipher order. So a server with "TLS_AES_256_GCM_SHA384:TLS_DHE_RSA_WITH_3DES_EDE_CBC_SHA" enforced order with TLS 1.3 and 1.2, will fail ciphers enabled, but succeed TLS versions and cipher order.

Also, we have decided that we will test cipher order using only the highest TLS version supported, ignoring TLS 1.3.

This creates a complexity for 6 cipher suites, which are listed under sufficient in NCSC appendix C, with the footnote:

These algorithm selections, combined with TLS 1.3 are Good. However, the (old) ciper suite notation used here will frequently result in the use of at most TLS 1.2 in software, which is Sufficient.

(note that the affected ciphers are also listed in the good section in their TLS 1.3 notation)

In the ciphers enabled code, I have currently counted those as "good", e.g. TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256, because that test looks only at the ciphers, and does not care about TLS versions.

Question 1: is this the desired behaviour? This means TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 TLS 1.2 is good on ciphers enabled, sufficient on TLS version. Alternative: score sufficient on both, which means only TLS 1.3 specific ciphers can be rated good in ciphers enabled.

If my current implementation is desired behaviour, it creates a complication in the cipher order. If we see TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as good, it means it should be preferred over e.g. TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384, a "true" sufficient cipher. This is my current new implementation, and it fails on e.g. forumstandaardisatie MX (on TLS 1.2), as that prefers the CBC over the GCM cipher: the server prefers a "true" sufficient cipher over a "sufficient-kinda-but-only-because-it's-TLS-1.2" cipher. Current production succeeds, i.e. allows this.

Question 2: should a cipher order of TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384 > TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 on TLS 1.2 lead to a failed cipher order test?

Pro: test should fail. In isolation, looking only at the cipher suite parameters, a sufficient (CBC) is preferred over a good (GCM). The fact that the TLS version would downgrade a real client connection to sufficient rating on either cipher, is out of scope as the test only concerns cipher order.

Con: test should succeed, as a client connecting on TLS 1.2 would receive a sufficient rating with either of these ciphers. Side effect: we would never compare good ciphers in ordering. In this interpretation, good ciphers are exclusive to TLS 1.3, and on TLS 1.2 any cipher can be sufficient at best. Suspected current production implementation.

The fundamental problem is: we separate ciphers enabled, cipher order and TLS version into 3 isolated and independent ratings. But a connection from a client has one rating in the end, which involves multiple of these factors. Then again, we could make the same argument for RSA key size, though isolating that is much more unambiguous.

My leaning: we should rate TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256 as good in the ciphers enabled test, as that test should be entirely isolated from the TLS version. We should allow (i.e. con section) the cipher order mentioned above as for real world use, either cipher provides an equal rating.

Note: this is unrelated to the suspected bug I identified in the previous comment: that concerned phase out vs sufficient ciphers. The ambiguity is only in 6 ciphers that NCSC lists as sufficient in the appendix but on their own merits are good.

gthess commented 6 months ago

If these are upgraded to good when combined with TLS1.3 and the cipher order test does not run on TLS1.3 (because everything is supposed to be good there), I would say treat them as sufficient for the cipher order test.

bwbroersma commented 3 months ago

It seems the code has different time-outs than the current code. After @nlitsme forced me with :beer: to add httpdelay= to b6a.nl, setting it to 10000 (10s) triggered expected 'not reachable' in internet.nl (test), while the current sslyze rewrite just finished without a not reachable.

bwbroersma commented 3 months ago

It seems the current PR handles tls_mail_smtp_starttls task different in terms of finishing and timeouts. It correctly finishes, but reports the TLS test failed as a timeout, while the current code continues in the back-end (while erroring the front-end) and can then provide a result with the /results url-hack. See https://github.com/internetstandards/Internet.nl/issues/1425#issuecomment-2143818855

~BTW I tried to run the sslyze branch, but got an error:~

worker-1     | ModuleNotFoundError: No module named 'sslyze'

~probably because the development is done on a special internetnl sslyze fork? Would be nice to really test both versions (on the same IP's).~ Update 2024-08-09: I probably did something wrong the last time, after a new git checkout sslyze, fixing the vendor directory, git submodule update --init, I did a make down, build, up and it all worked.

bwbroersma commented 1 month ago

@mxsasha Is it known that the current state of the implementation does not correctly detect ffdhe4096? On my servers I have:

# curl https://ssl-config.mozilla.org/ffdhe4096.txt | sudo tee /etc/ssl/dhparams/ffdhe4096.pem
ssl_dhparam /etc/ssl/dhparams/ffdhe4096.pem;

and

$ sha256sum /etc/ssl/dhparams/ffdhe4096.pem 
64852d6890ff9e62eecd1ee89c72af9af244dfef5b853bcedea3dfd7aade22b3  /etc/ssl/dhparams/ffdhe4096.pem

But in the current version of the sslyze version I see:

Key exchange parameters

Verdict: Your web server supports insufficiently secure parameters for Diffie-Hellman key exchange.

Technical details:
Web server IP address     Affected parameters     Status
...           DH-4096         insufficient

The current/old implementation finds this is okay, testssl.sh states:

TLSv1.2 (server order)
...
 x9e     TLS_DHE_RSA_WITH_AES_128_GCM_SHA256               DH 4096    AESGCM      128      DHE-RSA-AES128-GCM-SHA256          
 x9f     TLS_DHE_RSA_WITH_AES_256_GCM_SHA384               DH 4096    AESGCM      256      DHE-RSA-AES256-GCM-SHA384
...
 DH group offered:            ffdhe4096

SSL Labs cannot be used since https://github.com/ssllabs/ssllabs-scan/issues/524

mxsasha commented 1 month ago

@mxsasha Is it known that the current state of the implementation does not correctly detect ffdhe4096? On my servers I have:

Yes, this showed up in comparisons. Fixed but not deployed yet.

mxsasha commented 1 month ago

There is an issue with some certificate chains, which appears in ipcei-hydrogen.eu. The trust validation fails on CandidatesExhausted(Other("authorityKeyIdentifier must not contain authorityCertIssuer")), an error that originates in cryptography here.

Browsers trust it, and so does ssllabs and our old test. Potentially relevant is that the certificate chain contains the root and is out of order, and the root is SHA1:

openssl s_client ipcei-hydrogen.eu:443
Connecting to 2a04:9a00:1002:400d:f816:3eff:fe83:9a2f
CONNECTED(00000006)
depth=2 C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
verify return:1
depth=1 C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
verify return:1
depth=0 C=NL, ST=Zuid-Holland, L='s-Gravenhage, O=Rijksdienst voor Ondernemend Nederland, CN=ipcei-hydrogen.eu
verify return:1
---
Certificate chain
 0 s:C=NL, ST=Zuid-Holland, L='s-Gravenhage, O=Rijksdienst voor Ondernemend Nederland, CN=ipcei-hydrogen.eu
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
   a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
   v:NotBefore: Jul  8 06:09:03 2024 GMT; NotAfter: Jul  8 06:03:00 2025 GMT
 1 s:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   a:PKEY: rsaEncryption, 4096 (bit); sigalg: RSA-SHA1
   v:NotBefore: Nov 24 18:27:00 2006 GMT; NotAfter: Nov 24 18:23:33 2031 GMT
 2 s:C=BM, O=QuoVadis Limited, CN=QuoVadis Global SSL ICA G2
   i:C=BM, O=QuoVadis Limited, CN=QuoVadis Root CA 2
   a:PKEY: rsaEncryption, 2048 (bit); sigalg: RSA-SHA256
   v:NotBefore: May 25 19:47:39 2022 GMT; NotAfter: May 24 19:47:38 2027 GMT
---

mxsasha commented 1 month ago

Following up on the last comment: this is unrelated to ordering, including the root certificate, or SHA1. Best guess so far: a bug in the detection of the authorityCertIssuer extension in the cryptography python package.

The issue was introduced in this sslyze commit, included from 6.0.0 onwards, which switches to using py-cryptography's certificate validation.

I made a gist that isolates the issue. This is closely based on, but does not call, the sslyze code for this. My test case uses the leaf certificate and intermediate certificate only - the superfluous and out of order root certificate is removed. This will trigger the "authorityKeyIdentifier must not contain authorityCertIssuer" error, referring an extension defined in RFC5280 section 4.2.1.1.

That error originates in cryptography, based on the CABF baseline which states authorityCertIssuer MUST NOT be present. However, as far as I can find, neither of these certificates contains authorityCertIssuer. Both using local openssl and report-uri's PEM decoder do not see it. It also is unambiguously not permitted by CABF baseline, so it would be very strange for a certificate to suddenly show up that has it.

If the certificate indeed doesn't contain authorityCertIssuer, then the only explanation is py-cryptography detecting that extension in error. Going through the code did not reveal anything obvious - it has to be quite rare or it would have been noticed already. Or, I am missing something about x509.

Update: this was addressed in https://github.com/pyca/cryptography/issues/11461

internetstandards / Internet.nl