drwetter / testssl.sh

Testing TLS/SSL encryption anywhere on any port
https://testssl.sh
GNU General Public License v2.0

Save time / small solution #581

Open drwetter opened 7 years ago

drwetter commented 7 years ago

Hi,

a proposal for discussion:

There are currently several tests, vulnerability checks and others, that each work through a set of ciphers. Especially for the default run, this means single ciphers are checked over and over again.

Once it's clear whether a single cipher (under a given protocol) is supported, it doesn't have to be tested again. Retesting it is a waste of assessment time.

There are two ways to achieve this, maybe also depending on the command line option:

1. Keep a table (array) and fill it every time a cipher/protocol combination has been tested, successfully or not. Before every test do a lookup in this table.
2. Do upfront, at least internally, a run with -E and populate this table in the beginning. Use this one for every following test.

What do you think?

Cheers, Dirk

PS: WRT 2D arrays: wasn't this one of the things bash 3 hiccuped on? Does Sierra still ship v3?
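On the 2D array question: bash 3 (still what macOS ships) has no associative arrays, but the table can be emulated with flat "protocol:cipher" keys held in two parallel indexed arrays. A minimal sketch of the lookup-table idea; all function and variable names here are hypothetical, not testssl.sh code:

```shell
#!/usr/bin/env bash
# Sketch of the proposed cipher/protocol result cache. Bash 3 has no
# associative arrays, so the 2D table is emulated with flat
# "protocol:cipher" keys in two parallel indexed arrays.

declare -a CACHE_KEYS=()
declare -a CACHE_VALS=()

# cache_put <protocol> <cipher> <result: yes|no>
cache_put() {
     CACHE_KEYS+=("$1:$2")
     CACHE_VALS+=("$3")
}

# cache_get <protocol> <cipher> -> prints the cached result; rc 1 if
# the combination was never tested, i.e. the caller must really connect
cache_get() {
     local key="$1:$2" i
     for ((i=0; i<${#CACHE_KEYS[@]}; i++)); do
          if [[ "${CACHE_KEYS[i]}" == "$key" ]]; then
               printf '%s\n' "${CACHE_VALS[i]}"
               return 0
          fi
     done
     return 1
}

# Record one handshake result, then look it up instead of reconnecting.
cache_put "tls1_2" "ECDHE-RSA-AES128-GCM-SHA256" "yes"
cache_get "tls1_2" "ECDHE-RSA-AES128-GCM-SHA256"     # prints "yes"
cache_get "ssl3" "DES-CBC3-SHA" || echo "untested"   # prints "untested"
```

The linear search keeps it bash-3-compatible; on bash 4+ a `declare -A` associative array would do the same with O(1) lookups.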

AlGreed commented 7 years ago

> There are currently several tests, vulnerability checks and others, that each work through a set of ciphers. Especially for the default run, this means single ciphers are checked over and over again.

I am not sure I understand even half of how this script works, but:

> Do upfront, at least internally, a run with -E and populate this table in the beginning. Use this one for every following test.

If someone does not run all tests, but only checks a few vulnerabilities for his case, then it will be slower for him, as I understand it. :)

> Keep a table (array) and fill it every time a cipher/protocol combination has been tested, successfully or not. Before every test do a lookup in this table.

Don't we want to remove the duplicated code and move the handling of any cipher into a separate function? Then it would work well.

drwetter commented 7 years ago

> If someone does not run all tests, but only checks a few vulnerabilities for his case, then it will be slower for him, as I understand it. :)

There needs to be a defined point for when to populate the table upfront and when not. That decision is easy if a user requests a default run, and probably also for just "PFS" or the POODLE SSL test.

> Don't we want to remove the duplicated code and move the handling of any cipher into a separate function? Then it would work well.

Not sure I understand you here. POODLE, BEAST and some other checks work with a predefined list of ciphers. If this list hasn't been tested completely yet, the remainder still needs to be tested. In those cases I would not use the term "duplicated code" but rather "to-be-modified code".

dcooper16 commented 7 years ago

I'm going to work on implementing this. I've gotten started with this and am following the second option (populate tables up front, then use the table in following tests).

I'm trying to make the initial stage of populating the tables as robust as possible, so that it can handle things such as servers that won't accept more than 128 ciphers in a ClientHello or servers that don't implement version negotiation correctly.
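The 128-cipher ClientHello workaround could look roughly like this: split the full cipher list into chunks of at most 128 and probe each chunk in its own handshake. This is a sketch, not actual testssl.sh code; the function name is made up, and real code would call `$OPENSSL s_client -cipher "$chunk" ...` per chunk instead of printing:

```shell
#!/usr/bin/env bash
# Sketch of probing a server that rejects ClientHellos offering more
# than 128 cipher suites: chunk the list, one handshake per chunk.

MAX_PER_HELLO=128

# probe_in_chunks <colon-separated openssl cipher list>
# Prints one line per handshake that would be attempted.
probe_in_chunks() {
     local -a all=()
     local chunk i n
     IFS=':' read -r -a all <<< "$1"
     for ((i=0; i<${#all[@]}; i+=MAX_PER_HELLO)); do
          # join the next <=128 ciphers back into a -cipher string
          chunk=$(IFS=':'; printf '%s' "${all[*]:i:MAX_PER_HELLO}")
          n=$(( ${#all[@]} - i < MAX_PER_HELLO ? ${#all[@]} - i : MAX_PER_HELLO ))
          printf 'handshake offering %d ciphers, first: %s\n' "$n" "${chunk%%:*}"
     done
}

probe_in_chunks "AES128-SHA:AES256-SHA"   # one ClientHello covers both
```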

My plan is to first work on the code to populate the tables, and then work on the individual functions one at a time to change them to read from the tables rather than performing their own tests.

dcooper16 commented 7 years ago

I've made the code that I've written for this issue available at https://github.com/dcooper16/testssl.sh/tree/save_time. It is based on the 2.9dev_html branch and includes PRs #656, #662, and #664.

At the moment, at least, the speed improvement is rather modest: about 15% in a full run.

The main advantage is that it simplifies the code. There is a new function, precompute(), that does the upfront work of determining which protocol/cipher suite combinations are supported. As a result, functions such as run_allciphers(), run_cipher_per_proto(), run_pfs(), run_beast(), and run_rc4() become much simpler. cipher_pref_check() is also greatly simplified, since it no longer has to deal with the 128-cipher-suite-limit bug and it knows in advance whether the server supports any cipher suites that aren't supported by $OPENSSL.

I also took advantage of the upfront testing to make the code more robust in the case of unusual server configurations. For example, if I test rc4.badssl.com using a version of OpenSSL 1.1.0e that has been compiled with RC4 support, determine_optimal_proto() warns that the host "doesn't seem to be a TLS/SSL enabled server," since $OPENSSL s_client doesn't include RC4 ciphers in the ClientHello unless the -cipher option is used. If I tell testssl.sh to go ahead anyway, a lot of handshake failures occur for the same reason. In this code, since the upfront testing determined which ciphers the server supports, determine_optimal_proto() (and other functions such as service_detection()) can include a -cipher option in the calls to $OPENSSL s_client so that the handshakes don't fail.
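A rough sketch of that last point: build a -cipher argument from the precomputed results so later s_client handshakes don't fail. cipher_opt() and SUPPORTED_CIPHERS are made up for illustration, not testssl.sh code:

```shell
#!/usr/bin/env bash
# Sketch: once the upfront run knows which ciphers the server accepts,
# later s_client invocations can pass exactly those via -cipher, so
# handshakes against servers like rc4.badssl.com don't fail.

SUPPORTED_CIPHERS=("RC4-SHA" "RC4-MD5")   # filled in by upfront testing

# Print a "-cipher <list>" argument, or nothing if no ciphers are known
# yet (then s_client falls back to its default cipher list).
cipher_opt() {
     (( ${#SUPPORTED_CIPHERS[@]} )) || return 0
     local joined
     joined=$(IFS=':'; printf '%s' "${SUPPORTED_CIPHERS[*]}")
     printf '%s' "-cipher $joined"
}

# Real code would then run something like:
#   $OPENSSL s_client $(cipher_opt) -connect rc4.badssl.com:443 </dev/null
echo "extra s_client args: $(cipher_opt)"   # -cipher RC4-SHA:RC4-MD5
```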

On the other hand, if I test rc4.badssl.com with a version of OpenSSL 1.1.0e that has not been compiled with RC4 support, the code warns that the OpenSSL being used doesn't support any of the cipher suites the server offers (even though it is an SSL/TLS enabled server). So the user is warned that there will be false negatives/positives, but that they could be avoided if a different OpenSSL were used.

drwetter commented 7 years ago

Thanks!

> At the moment, at least, the speed improvement is rather modest: about 15% in a full run.

For testssl.sh --ip=one testssl.net 1) I got 48 secs vs. 57 secs, so a bit more. Looking at your code: I was actually thinking of a different implementation which would mean much more rearrangement of the code. That's why I thought it should be implemented in the next development branch. 2.9dev should settle at a certain point in time, and it needs some work to finish the open issues.

Your solution is smaller, so that's fine. Also it saves a bit of time, that's totally cool. If it addresses #660 -- even better!

For the record: I was thinking of no longer doing a connect once we already know the outcome; the lookup should be enough then. However, we cannot tell whether the user e.g. just wants to test forward secrecy or SSL POODLE, in which case precomputing everything would likely be too costly in terms of time. For this, tests need to be done which give us an estimate of how much time the old implementation costs in each function, so we can decide which is faster. There should be a global variable as the criterion for whether the old individual check is performed or just a lookup in the array/table. But as said, stuff for the future.

1) no heartbeat extension, thus no ~5 seconds wait
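The global-variable criterion could be sketched roughly like this. All names and the cost numbers are made up for illustration; the real numbers would come from the per-function timing measurements mentioned above:

```shell
#!/usr/bin/env bash
# Sketch of a global flag that decides per run whether checks do their
# own connects or just consult the precomputed table.

PRECOMPUTE_DONE=false

# maybe_precompute <estimated individual connects> <cost of upfront run>
# Precompute only when the selected checks would need enough individual
# connects to amortize the upfront scan (both costs are hypothetical).
maybe_precompute() {
     if (( $1 > $2 )); then
          # ... the -E style upfront scan would run here ...
          PRECOMPUTE_DONE=true
     fi
}

# Each check consults the flag: table lookup vs. classic handshake.
check_cipher() {
     if "$PRECOMPUTE_DONE"; then
          echo "lookup"
     else
          echo "connect"
     fi
}

maybe_precompute 370 130   # e.g. a default run: many connects expected
check_cipher               # prints "lookup"
```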

dcooper16 commented 7 years ago

> However we cannot tell if the user e.g. just wants to test forward secrecy or SSL POODLE, which likely would then be too costly in terms of time if we precompute everything.

Actually, this isn't true. The upfront testing is performed after the command line is processed, so we know exactly what tests the user wants.

The code I wrote doesn't just perform the equivalent of -E. First, setup_precompute() determines what upfront testing should be performed based on what tests were selected. If only --pfs is selected, then the upfront testing only looks for PFS ciphers (without regard for protocol). If only --poodle is selected, then very little upfront testing is performed (basically just enough to determine what protocols are supported).

The first part of run_pfs(), which lists the supported ciphers, doesn't do any testing; it just looks up the precomputed results. run_ssl_poodle(), on the other hand, first checks whether a CBC cipher with SSLv3 was found during the upfront testing. It also checks whether all of the CBC ciphers were tested (with SSLv3) during the upfront testing. If nothing was found during upfront testing, but not all ciphers were checked, then a test is run; otherwise, no test is run.
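That run_ssl_poodle() decision, as a simplified sketch (the flag names are hypothetical, not testssl.sh code): only fall back to a live test when the upfront run neither found an SSLv3 CBC cipher nor covered every CBC cipher.

```shell
#!/usr/bin/env bash
# Sketch of the POODLE decision logic driven by precomputed results.

CBC_FOUND_SSLV3=false   # did upfront testing find a CBC cipher under SSLv3?
ALL_CBC_TESTED=true     # did upfront testing cover all CBC ciphers?

poodle_decision() {
     if "$CBC_FOUND_SSLV3"; then
          echo "vulnerable, answered from the table, no new connect"
     elif "$ALL_CBC_TESTED"; then
          echo "not vulnerable, answered from the table, no new connect"
     else
          echo "run live test"   # some CBC ciphers untested: must connect
     fi
}

poodle_decision   # prints "not vulnerable, answered from the table, no new connect"
```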

drwetter commented 7 years ago

Hi David!

> > However we cannot tell if the user e.g. just wants to test forward secrecy or SSL POODLE, which likely would then be too costly in terms of time if we precompute everything.
>
> Actually, this isn't true. The upfront testing is performed after the command line is processed, so we know exactly what tests the user wants.

um, yes -- that's not what I meant. We do know it once the program is exec'd, but we shouldn't assume that every run is a default run. See the last paragraph in https://github.com/drwetter/testssl.sh/issues/666#issue-215252413.

For your code: I had a quick look at the diff during the weekend; sorry, I overlooked the precomputing part. In general the solution seems smart, but I don't get yet why it's "only" 10-20% for the whole-nine-yards run. Will do some measurements of how much time is spent per function.

dcooper16 commented 7 years ago

Part of the reason the speedup isn't that great is that a lot more internal processing is needed to create and then check the lookup tables. I've been working to optimize that code (and just checked in some more optimizations), but it still takes a significant amount of time.

I also think more can still be done to take advantage of the upfront testing. For example, I haven't changed run_server_defaults() yet, but it seems that this function could take advantage of the upfront testing.

drwetter commented 7 years ago

Please have a look @ #667 -- there you can find the real time killers :-)