Improve (na)ssl dependency situation

mxsasha commented 2 years ago

Our TLS tests currently depend on our own fork of nassl (the internetnl branch). This fork is based on a 4 year old version of nassl, built on a fork of an older openssl version. The build process is fragile, and this is not a sustainable setup. We have already talked/mailed about this - this issue is meant to summarize plans and thoughts about improving this.

The custom nassl fork adds support for TLS 1.3, ChaCha20 and Poly1305. These have since been added to nassl itself, so we likely do not need our own fork anymore. However, there are also API differences, which may be APIs we added to our own fork, and/or changes in nassl in the last years.

In addition to nassl dependencies, our TLS checks are a few thousand lines of our own code. From what I’ve seen so far, a lot of this code also has a high complexity. Together with the vague dependencies and their build process, this makes TLS one of the more complex and fragile parts of the project.

Built on top of nassl is sslyze, an SSL/TLS scanning tool by the same author. I played around with it a bit, and I am impressed by how thorough it is, and the interface is really neat. Potentially, we could replace a lot of our code with calls to sslyze instead. It is currently not feature complete enough for us, but perhaps we could build some features and contribute them back. It does make judgements about the results, but can report enough raw data that we can still apply our own rules and scoring. This is an example JSON output to show the kind of checks and detail.

@gthess contributed this list of probable requirements:

Are all the current TLS checks that we support covered by the library?
Is it possible to pass IP addresses for testing instead of domain names?
Can you run individual checks against a server or you call a "check_server" function that runs the whole thing? If the latter is the case, can checks be conditionally turned on/off for a domain? For example to limit connections/testing time when testing a mail server for checks that are not relevant for mail servers e.g., OCSP-stapling.
Can we influence how connections are made to a server during testing (consecutive/parallel). Especially for email we want the connections to be done in series.
In what degree can we influence the checks themselves? For example if we only care for non good ciphers we don't want the test to waste connections (and possibly hitting connection limits) on testing ciphers that we don't want to take any action about. Another example is to stop the cipher test on the first offending cipher; no need to waste connections if we know the verdict already.
Can it handle cipher order preference testing?
Some tests are HTTP based (HTTPS available/HTTPS redirect/HTTP compression/HSTS/HTTP header tests). For these we use the clients that nassl provides for initiating the TLS connection. We would still need to use these for connecting via HTTPS (we can't rely on python's requests for example in case the TLS connection is considered insecure/old for the system's cryptolibraries) This bullet point could be up for discussion to fail those HTTP tests if a TLS connection cannot be made by a modern system.

Personally I do not think the last bullet should be a hard requirement. If your TLS setup is so exceptionally broken that an ordinary Python client can't connect anymore, I feel fine with reporting that we skipped some tests - because whatever those tests would report, it isn't remotely the main issue that site is having.

Also important to consider some configuration cases like #753.

Currently sslyze does not support cipher order, but this is planned, so that is one clear missing feature.

I see a few potential paths:

Upgrade our nassl dependency to the latest nassl version. Unclear how difficult this is, it might just be changing/adding a few API calls. If we need API calls that are missing from nassl now, try to integrate them in a way that we can contribute them upstream. We should really try to minimize maintaining custom forks when possible.
Identify in more detail which sslyze features are missing to have a better idea of how feasible it is to use sslyze as our TLS test backend, or what we additional features we would need to ask for or add ourselves. Cipher order is a definite one, there may be more. We could also see if we can fund nassl to add these features. This has a lot of potential, because it means we have much less custom code, and contribute to the wider TLS testing ecosystem.
Another TLS testing library. I'm not aware of any others with this much functionality though.

Some mix may also be a good choice. It's probably worth exploring all options in some more detail.

(Note that our nassl fork does have a "how to keep the fork up to date" section in the readme, but this only updates the main branch, not our custom branch).

mxsasha commented 1 year ago

Plan: explore a number of libraries and see how they match up with our requirements, probably do some testing on good and bad setups to see how they handle. At least consider: sslscan, sslyze, testssl.sh, cryptonice, tls-scanner.

baknu commented 1 year ago

https://github.com/Pieter1024/tls-guidelines-scan

mxsasha commented 1 year ago

Cryptography 39.0.0 is broken for us, possibly because they have dropped OpenSSL 1.1.0 support. The error is weird, but we've seen weird ways for errors to surface before:

  File "/opt/internetnl/Internet.nl/interface/management/commands/probe.py", line 6, in <module>
    from checks.tasks import ipv6, dnssec, mail, shared, appsecpriv, tls, rpki
  File "/opt/internetnl/Internet.nl/checks/tasks/tls.py", line 13, in <module>
    import eventlet
  File "/opt/internetnl/Internet.nl/.venv/lib/python3.7/site-packages/eventlet/__init__.py", line 17, in <module>
    from eventlet import convenience
  File "/opt/internetnl/Internet.nl/.venv/lib/python3.7/site-packages/eventlet/convenience.py", line 7, in <module>
    from eventlet.green import socket
  File "/opt/internetnl/Internet.nl/.venv/lib/python3.7/site-packages/eventlet/green/socket.py", line 4, in <module>
    __import__('eventlet.green._socket_nodns')
  File "/opt/internetnl/Internet.nl/.venv/lib/python3.7/site-packages/eventlet/green/_socket_nodns.py", line 7, in <module>
    eventlet.patcher.slurp_properties(__socket, globals(), ignore=__patched__, srckeys=dir(__socket))
AttributeError: module 'eventlet' has no attribute 'patcher'

38.0.4 definitely works, and we do use OpenSSL 1.1.0, so that is my best guess. #872 pins us to <39.0.0 for now.

mxsasha commented 1 year ago

CryptoNice

Python. Built on top of SSLyze and a few other tools. Maintained by F5, last commit August 2021. A bit fragile, some docs are outdated and the requirements are incorrect as it does not run with the latest version of sslyze. JSON output is a lot less than sslyze, and very likely insufficient. Not well suited to our needs.

tls-scanner

Java. Last commit July 2022. Could not get it running reliably.

testssl.sh

Bash. Last commit February 2023. Supports STARTTLS. May not support testing on IP while sending SNI hostname, but can connect on IP. Allows testing individual ciphers, giving us opportunities to shortcut. Can be slow, even when limiting tests to the important ones for us, when many TLS versions are supported. Does detect cipher order. Detects SSLv3 but does not seem to detect SSLv2. JSON is ok but not great. Some scan results.

sslscan

C. Last commit January 2023. Supports STARTTLS. Seems to support testing by IP and sending a hostname on SNI. Detects SSLv2/3, but not the ciphers on there. Does detect cipher order. XML seems a bit shallow. Some scan results.

sslyze

Python. Last commit January 18. Detects SSLv2/v3 and the ciphers. Supports testing on IP with SNI. Does not detect cipher order. Great JSON output.

Conclusion

The best candidates for now are testssl and sslyze, and those should get a deeper dive to see how close they can get to what we need and whether adopting one of them is a good plan forwarc. Sslyze definitely misses cipher order, but perhaps we can get that added. It does have the advantage of being in a known language for us. Testssl may be more functionally complete, but there is less chance for us to patch it if needed.

mxsasha commented 1 year ago

Did more testing specifically looking at our current tests:

TLS

test	testssl	sslyze
TLS versions (SSL 3 through TLS 1.3)	✔️	✔️
Ciphers enabled on server	✔️	✔️
Cipher order preference on server	✔️	❌
(EC)DHE key exchange parameters	✔️	✔️
Key exchange hash functions	✔️	✔️
TLS compression enabled	✔️	✔️
Insecure renegotiation	❌	✔️
Client-initiated renegotiation	✔️	✔️
0-RTT	❌	✔️	✔️
OCSP stapling	✔️	✔️

For OCSP stapling, check how deeply we verify currently.

Certificate

test	testssl	sslyze
Trust chain	✔️	✔️
ECDSA/RSA parameters	✔️	✔️
Hash algorithm	✔️	✔️
Domain name match	✔️	✔️

Running the test

test	testssl	sslyze
IPv6	✔️	✔️
STARTTLS	✔️	✔️
Passing IP instead of hostname	✔️	✔️
Serial/parallel connections and/or connection timing	Serial, timing adjustable	slow_connection option
Limit which ciphers we test for	✔️	❌
Terminate test on first violation	❌	❌
Speed of test	kinda slow	great without slow_connection
Ease of integration	ok, subprocess call + json	great, python API
Language known	❌	✔️

Tested as:

testssl.sh -6 --nodns min --ids-friendly --protocols --preference --server_defaults --renegotiation --crime --mapping iana --show-each --json-pretty mozilla-old.badssl.com (26s)
testssl.sh --starttls smtp --nodns min --ids-friendly --fs --protocols --preference --server_defaults --renegotiation --crime --mapping iana --show-each --json-pretty internet.nl:25 (14s)
testssl.sh --nodns min --ids-friendly --fs --protocols --preference --server_defaults --renegotiation --crime --mapping iana --show-each --json-pretty internet.nl (14s)
sslyze --mozilla_config=disable --slow_connection --sslv2 --sslv3 --tlsv1 --tlsv1_1 --tlsv1_2 --tlsv1_3 --elliptic_curves --certinfo --compression --reneg --early_data mozilla-old.badssl.com (64s)
sslyze --mozilla_config=disable --sslv2 --sslv3 --tlsv1 --tlsv1_1 --tlsv1_2 --tlsv1_3 --elliptic_curves --certinfo --compression --reneg --early_data mozilla-old.badssl.com (26s)
sslyze --slow_connection --mozilla_config=disable --sslv2 --sslv3 --tlsv1 --tlsv1_1 --tlsv1_2 --tlsv1_3 --elliptic_curves --certinfo --compression --reneg --early_data --starttls smtp internet.nl:25 (23s)
sslyze --mozilla_config=disable --sslv2 --sslv3 --tlsv1 --tlsv1_1 --tlsv1_2 --tlsv1_3 --elliptic_curves --certinfo --compression --reneg --early_data internet.nl (2.8s)
sslyze --mozilla_config=disable --slow_connection --sslv2 --sslv3 --tlsv1 --tlsv1_1 --tlsv1_2 --tlsv1_3 --elliptic_curves --certinfo --compression --reneg --early_data internet.nl (4s)

mxsasha commented 1 year ago

Our current nassl fork is also not compatible with Python 3.10 and newer:

  File "internet.nl/checks/tasks/tls_connection.py", line 267, in connect
    self.do_handshake()
  File ".venv/lib/python3.10/site-packages/nassl-1.1.3-py3.10-macosx-13-x86_64.egg/nassl/ssl_client.py", line 204, in do_handshake
    self._network_bio.write(handshake_data_in)
SystemError: PY_SSIZE_T_CLEAN macro must be defined for '#' formats

bwbroersma commented 1 year ago

Great analysis @mxsasha. I agree it's probably best to go for sslyze in the current Python setup.

I'm a huge fan of testssl and bash, but the 1.14 MB testssl.sh file with almost 24k of code is insane, even for a bash fan like me.

About the parallel option, that's only used in bulk testing (so this won't work). I'm pretty sure because of this code block testssl.sh#L23856-L23873, you can search for all run_heartbleed occurrences.

You can trace time with:

docker run --rm -e MEASURE_TIME=true -ti drwetter/testssl.sh internet.nl

The 131 seconds are split like this:

name	time
parse	0
prepare_arrays	3
initialized	1
determine_rdns	0
run_protocols	9
run_npn	0
run_alpn	1
run_cipherlists	8
run_server_preference	20
run_fs	19
run_server_defaults	24
do_header	2
run_heartbleed	0
run_ccs_injection	4
run_ticketbleed	2
run_robot	0
run_renego	2
run_crime	0
run_breach	1
run_ssl_poodle	0
run_tls_fallback_scsv	0
run_sweet32	1
run_freak	2
run_drown	0
run_logjam	0
run_beast	0
run_lucky13	1
run_winshock	0
run_rc4	0
run_starttls_injection	0
run_client_simulation	30
run_rating	1

If the code setup is moved more to docker and small tasks, some sub-test could be done with testssl in worker-nodes. Probably if it's properly split in sub-tasks, it can be fast when the sub-tasks run parallel. Since testssl is only a thin wrapper around openssl, it's probably first to have new features.

mxsasha commented 1 year ago

@gthess do you have more thoughts on this that we should know? (We discussed this before, your earlier comments are in the initial description.) TLDR: I think sslyze is quite close to our requirements and plan to do a proof of concept to explore integrating it as an TLS test backend.

gthess commented 1 year ago

Hi @mxsasha, I was following this issue and although I don't have time to go into detail, based on your elaborate analysis I also believe that sslyze seems the way to go. It could also simplify the installation procedure. Being based on nassl, I assume its raw clients could be used to do internet.nl specific tests if they are not supported by sslyze; not preferred but good to know if that is a possibility.

"Limit which ciphers we test for" and "Terminate test on first violation" sound like easy additions to sslyze.

"Cipher order preference on server" not so but that could be a case where the raw nassl clients are used to do manual testing.

My main concern is email testing where most restrictions are required like no parallel connections and fail early. (I believe the "slow_connection" option could mostly get us there but ultimately the proper way to solve this (also related to #670 #889) is to have ratelimiting logic based on IP before starting a test. So if the ratelimit would have been reached the task would be enqueued again. Then different ratelimits could be set for different kind of tests with the mail test being the most conservative I guess. Then most of the heuristics about celery workers and queue lengths could be removed. But I am trailing off on this issue :)

baknu commented 1 year ago

I. It seems the command line option for testing a mail server is --starttls=smtp --certinfo_basic. Source: https://github.com/nabla-c0d3/sslyze/issues/172

II. To prevent running into rate limits there seem to be two useful settings:

--slow_connection

Greatly reduce the number of concurrent connections initiated by SSLyze. This will make the scans slower but more reliable if the connection between your host and the server is slow, or if the server cannot handle many concurrent connections. Enable this option if you are getting a lot of timeouts or errors.

Source: https://github.com/nabla-c0d3/sslyze/issues/385 and https://www.kali.org/tools/sslyze/

An option to control the number of concurrent connections per single server (since version 3.0.0):

Huge improvements to the reliability of the scans: The number of concurrent connections per single server can now be controlled and is set to 5 by default (https://github.com/nabla-c0d3/sslyze/issues/385). This limit is enforced regardless of the number of scan commands queued for the server, and drastically reduces the number of scans that fail due to a slow server or a slow connection.

Source: https://github.com/nabla-c0d3/sslyze/releases/tag/3.0.0

mxsasha commented 9 months ago

Obsoleted by next comment, keeping for reference:

Based on our latest discussions, I've been testing testssl and sslyze tools more, and found the following: * Both miss a key feature: sslyze has no server cipher preference detection, testssl misses zero-rtt. * On https, sslyze is just great. I've never seen it fail and it takes seconds, pretty much always <5. * Testssl works on web, including cipher order. However, it's incredibly slow, particularly if you want to see all ciphers which we do need. Think ~90s for your average web site. * This makes sslyze the more obvious choice, except that mail is a mess. * There are hosts where none of the three (current internet.nl, testssl and sslyze) can run the test. We likely can not fix that. There are also many hosts where all three are decent. * Testssl seems more capable, and completes fine on e.g. mx.ncsc.nl where sslyze times out and breaks, probably due to being more aggressive. Does take long though. * There are also some hosts, at least Fastmail, that our current test can do, but neither testssl nor sslyze manage without getting blocked. I can't determine for sure whether allowlisting may be involved. This is all kind of anecdotal as server behaviour can also be a bit unpredictable. However, it seems sslyze is a great choice for web, testssl is best for mail, and both are missing an essential feature. Thoughts on paths forward: * TLS testing for mail is finicky and has limited reliability. This is a continuous problem for us. Replacing with a standard tool may make this issue a bit worse. * The two tools are strong and weak in separate areas and both have a missing feature. It may be possible to add those, but that will take time. The issues around performance and compatibility with mail servers feel very complex to improve. Therefore, perhaps our best path forward is to utilize both tools depending on the type of test. This is not great - it adds complexity and it means some detection will use different code for web vs mail. But I think it is the least bad so far. This may affect our ability to test some mail servers that accept our current test but not testssl. Alternatives (which all also require implementing zero-rtt in testssl or server cipher preference in sslyze): * We fix either the performance in testssl or the mailserver blocking difficulties in sslyze. I think this will be very difficult. * We accept a significant reduced ability to test mail servers. I think this is worse than two tools. * We accept all web tests will take 60-120s. I think it's too slow. * We cancel the migration to a third party tool and keep maintaining our own code. We try to upgrade to the latest nassl version to reduce the maintenance burden. The difficulty of this is unknown until we try. In addition, we must keep maintaining all our own TLS code forever, which, every time I've dug into it, I've found quite opaque. **Update**: I just now re-read baknu's comment on "2. An option to control the number of concurrent connections per single server (since version 3.0.0)" - there is no command line option so I missed it. Will try it.

mxsasha commented 9 months ago

Update: with the (not exposed through command line) connection limit that I entirely overlooked, sslyze seems to perform for mail similarly to testssl based on my limited dataset. I will move forward with an experimental implementation from there so we can compare it in a larger dataset. Still, it seems there are at least some mail hosts that succeed in current internet.nl, but fail in both testssl and sslyze, and sslyze is still missing cipher preference detection.

baknu commented 9 months ago

Thanks for the update.

with the (not exposed through command line) connection limit that I entirely overlooked, sslyze seems to perform for mail similarly to testssl based on my limited dataset.

Is this the setting that I referred to under 2 in the following comment: https://github.com/internetstandards/Internet.nl/issues/714#issuecomment-1570515269 ?

I will move forward with an experimental implementation from there so we can compare it in a larger dataset.

Nice. We maybe can use the gov domain set and the Tranco 5000 .NL domain set for comparison/reference. See public reports on https://dashboard.internet.nl/ I have asked EJ to enable the mail test for the Tranco list too.

Still, it seems there are at least some mail hosts that succeed in current internet.nl, but fail in both testssl and sslyze,

So does Internet.nl make fewer connections in total to get a result than testssl and sslyze? Or is Internet.nl making fewer connections per time unit? Anyway, Internet.nl does not test for certain less common ciphers (that testssl and sslyze probably both are testing for). From the test explanation on Internet.nl:

Note that ciphers using PSK or SRP for key exchange (which are not sufficiently secure) are not detected in this test due to a limitation related to our testing method.

and sslyze is still missing cipher preference detection.

I believe the issue that is referred to in https://github.com/nabla-c0d3/sslyze/issues/338#issuecomment-762577568 has been encountered and tackled in the Internet.nl project: https://github.com/internetstandards/Internet.nl/issues/461 Maybe our code could be an inpiration to get this fixed in sslyze.

Furthermore, I was thinking about our current "Terminate test on first violation" functionality that you mention in your comment on https://github.com/internetstandards/Internet.nl/issues/714#issuecomment-1448452067. I believe (but am not sure) that this only goes for the "Cipher order" subtest and not for the "Ciphers (Algorithm selections)" subtest. In the test explanation for "cipher order" the following is stated:

In the above table with technical details only the first found out of prescribed order algorithm selections are listed.

mxsasha commented 9 months ago

with the (not exposed through command line) connection limit that I entirely overlooked, sslyze seems to perform for mail similarly to testssl based on my limited dataset.

Is this the setting that I referred to under 2 in the following comment: #714 (comment) ?

Yes.

I will move forward with an experimental implementation from there so we can compare it in a larger dataset.

Nice. We maybe can use the gov domain set and the Tranco 5000 .NL domain set for comparison/reference. See public reports on https://dashboard.internet.nl/ I have asked EJ to enable the mail test for the Tranco list too.

Yes, that would be good.

Still, it seems there are at least some mail hosts that succeed in current internet.nl, but fail in both testssl and sslyze,

So does Internet.nl make fewer connections in total to get a result than testssl and sslyze? Or is Internet.nl making fewer connections per time unit? Anyway, Internet.nl does not test for certain less common ciphers (that testssl and sslyze probably both are testing for). From the test explanation on Internet.nl ...

Those ciphers are now included in at least testssl, but there are so few it is unlikely to make any significant difference. I think our own code makes fewer connections by trying a little harder to get maximum data from each connection. I can't say how big the difference is though and how much effect it has.

Furthermore, I was thinking about our current "Terminate test on first violation" functionality that you mention in your comment on #714 (comment). I believe (but am not sure) that this only goes for the "Cipher order" subtest and not for the "Ciphers (Algorithm selections)" subtest

Ah right, may be.

internetstandards / Internet.nl