drwetter / testssl.sh

Testing TLS/SSL encryption anywhere on any port
https://testssl.sh
GNU General Public License v2.0

Sudden, significant slowdown (7 - 10x slower) when running tests via Docker image as of Apr 1 #2354

Closed (mig5 closed this issue 1 year ago)

mig5 commented 1 year ago

Versions affected

Docker SHA256s: 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3

Command line / docker command to reproduce

docker run -v /usr/local/etc:/usr/local/etc drwetter/testssl.sh --file /usr/local/etc/ssl_test_list.conf --parallel --fast --quiet --color 0 --ip one --openssl /usr/bin/openssl
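For context, the file passed via --file is just a plain list of targets, one per line (each line is a host, or a full set of testssl.sh arguments, if I recall the --file format correctly). The contents below are purely illustrative, not my actual list:

# /usr/local/etc/ssl_test_list.conf (illustrative example only)
example.org
example.com:8443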

Experienced behavior

On Docker SHA256 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e, as at March 31st 3:16AM UTC, scanning 21 sites took 7 minutes, 35 seconds.

On April 1st, with 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0, scanning the exact same list of sites at the same time of day took over 27 minutes before I aborted the test.

On Apr 2nd, I reduced the number of sites to scan from 21 to 10, and used the latest Docker image 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 (my script always pulls the latest image before running). It took 16 minutes.

On another system, scanning just 4 sites, the same pattern occurred: test time went from 1 minute 44 seconds to 7 minutes 38 seconds with the same change in Docker image version. On yet another system, scanning 6 sites went from 1m 53s to 10m. So I can rule out a problem with the first server running the Docker image: I've reproduced it on 3 separate servers (which are also testing different sites, so we can rule out the other side being the issue). The common factor is the Docker image change.

Expected behavior

I didn't expect the two most recent Docker images to show a drastic slowdown scanning the same sites I've been scanning automatically every night for a couple of years. Wondering if something changed in the most recent releases that would explain the slowdown?

Your system (please complete the following information):

drwetter commented 1 year ago

Thx. Can you make our lives easier and at least mention the version of testssl.sh?

mig5 commented 1 year ago

Sorry @drwetter, since it's running in Docker from your official image, I thought the SHAs would be enough for you.

docker run -v /usr/local/etc:/usr/local/etc drwetter/testssl.sh --version is returning the following:

###########################################################
    testssl.sh       3.2rc2 from https://testssl.sh/dev/

      This program is free software. Distribution and
             modification under GPLv2 permitted.
      USAGE w/o ANY WARRANTY. USE IT AT YOUR OWN RISK!

       Please file bugs @ https://testssl.sh/bugs/

###########################################################

 Using "OpenSSL 1.0.2-bad (1.0.2k-dev)" [~183 ciphers]
 on e91b0d963c9f:/home/testssl/bin/openssl.Linux.x86_64
 (built: "Sep  1 14:03:44 2022", platform: "linux-x86_64")

drwetter commented 1 year ago

FYI: there was a bigger change 2 weeks ago, but that doesn't correlate with your report, and if anything it made things faster.

I somehow doubt that's the cause.

@polarathene did a lot of good things to the Dockerfile here, and he's using the Docker images for his own project. I suspect if something were fishy he'd speak up ;-)

Other than that, there were only two minor changes in the past days, see closed PRs #2353 and #2352, which should have little to no impact.

mig5 commented 1 year ago

OK - sorry to have created noise. The only common element in the infrastructure (which ranges across different countries, hosting accounts, customers and obviously SSL endpoints) other than the same Docker SHAs is me, so maybe it's me :)

I'll close it out; there's no point keeping it open if it's only me experiencing it.

polarathene commented 1 year ago

Potentially minor regression

March 31st was our last CI run for the image test, so I'll need to wait until the next one for a comparison:

ok 2 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + RSA in 50186ms
ok 3 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + ECDSA in 46870ms
ok 4 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + ECDSA with RSA fallback in 55187ms
ok 5 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + RSA in 48744ms
ok 6 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + ECDSA in 45785ms
ok 7 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + ECDSA with RSA fallback in 51949ms

That step took 5 minutes which is fairly normal.

Although if I go back to runs before the Dockerfile for drwetter/testssl.sh:3.1dev was switched to openSUSE Leap, those do appear to finish in 3 min 40 sec on average. Those are parallel file tests too.

Before switching to the openssl package internal to the Alpine image (via --openssl), it would also take around 5 minutes, I think.

ok 2 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + RSA in 37466ms
ok 3 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + ECDSA in 34245ms
ok 4 [Security] (TLS) (cipher lists) 'TLS_LEVEL=intermediate' + ECDSA with RSA fallback in 39373ms
ok 5 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + RSA in 37210ms
ok 6 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + ECDSA in 33201ms
ok 7 [Security] (TLS) (cipher lists) 'TLS_LEVEL=modern' + ECDSA with RSA fallback in 38257ms

Each of those tests is against 7 ports for non-HTTPS services, against each mix of supported cipher-suites that our docker-mailserver image can be configured for.


Unable to reproduce - New image performs better

Running a single site test from my own ageing system locally:

# SHA256 digest: 032002467a862feec3e329720af1c023ec6714d1d98a63e7b030945de656c4ab
$ docker run --rm -it drwetter/testssl.sh:3.1dev github.com
 Done 2023-04-03 21:53:58 [  82s] -->> 20.248.137.48:443 (github.com) <<--

$ docker run --rm -it drwetter/testssl.sh:3.1dev --openssl /usr/bin/openssl github.com
 Done 2023-04-03 22:03:49 [  75s] -->> 20.248.137.48:443 (github.com) <<--

# March 31st:
$ docker run --rm -it drwetter/testssl.sh@sha256:6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e --openssl /usr/bin/openssl github.com
 Done 2023-04-03 22:12:30 [  92s] -->> 20.248.137.48:443 (github.com) <<--

# April 2nd:
$ docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --openssl /usr/bin/openssl github.com
 Done 2023-04-03 22:15:02 [  75s] -->> 20.248.137.48:443 (github.com) <<--

Those times are better than what I was getting locally 3 months ago for that same test. Even compared to the Alpine 3.17 + OpenSSL 3.x image digest from March 31st, the results are still better.


Establishing a reproduction

Since you could reproduce the issue across multiple systems, it looks like an environment issue? Our GitHub Actions CI runs on an Ubuntu 22.04 runner, but I wouldn't expect that to make much difference.

My host system is running Docker Engine v23; GitHub Actions hasn't yet updated their CI runners to that AFAIK, but it is planned.

Could you try reproducing against github.com like shown above?

If you don't get the slowdown there, it may have something to do with the sites you're checking against. Assuming you do get better times with github.com, I think the problem would be that the image is now missing something that wasn't an explicit dependency in the Alpine image.


CA certs no longer bundled into image?

docker-mailserver performs a test against a local service with a custom CA, which is configured for the testssl.sh container so the chain-of-trust check passes. I looked inside the new openSUSE Leap image that testssl.sh has been switched to, and it no longer has the ca-certificates content you'd expect to find in /etc/ssl.

I actually looked into this (unrelated to testssl.sh) recently, and almost forgot about it :sweat_smile:

Did your output from running testssl.sh change at all? Any errors logged? (I think there are some debug options you could run to improve that.) It'd be helpful if there were more to go on than just the slower run time.

I'd still be interested in knowing what the github.com test time is for you between the two image digests. Afterwards, try building the Dockerfile, but add the ca-certificates-mozilla package after curl on this line:

https://github.com/drwetter/testssl.sh/blob/aa5235e65826ae1104a3ef3d0cba208ef4e4b548/Dockerfile#L14
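A rough sketch of that experiment (the image tag is arbitrary, and the exact package list on that Dockerfile line may differ from what I remember):

$ git clone https://github.com/drwetter/testssl.sh && cd testssl.sh
# edit the Dockerfile: append ca-certificates-mozilla after curl on the zypper install line
$ docker build -t testssl-ca-test .
$ docker run --rm -it testssl-ca-test --openssl /usr/bin/openssl github.com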

That might resolve the problem if it's related to CA certs / trust store. If that's not it, it might be DNS related, but github.com reproduction results should provide more insight :+1:

I have tried a before / after (with ca-certificates-mozilla added) against a community website I'm an admin for. We use Let's Encrypt there on an openSUSE Leap 15.4 server running docker containers. No change in test time (~160 sec), no change in output.

I also ran the March 31st digest (Alpine image) against that site and got 196 sec. No output difference there either.

mig5 commented 1 year ago

Thanks. Let's assume it's me - I appreciate your help, I've closed the issue out anyway.

I haven't got a lot of time to do a lot of testing right now, but I did run those:

docker run --rm -it drwetter/testssl.sh:3.1dev github.com
 Done 2023-04-03 23:26:43 [  76s] -->> 140.82.121.4:443 (github.com) <<--

docker run --rm -it drwetter/testssl.sh:3.1dev --openssl /usr/bin/openssl github.com
 Done 2023-04-03 23:29:02 [  68s] -->> 140.82.121.4:443 (github.com) <<--

docker run --rm -it drwetter/testssl.sh@sha256:6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e --openssl /usr/bin/openssl github.com
 Done 2023-04-03 23:32:39 [ 176s] -->> 140.82.121.4:443 (github.com) <<--

docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --openssl /usr/bin/openssl github.com
 Done 2023-04-03 23:33:54 [  69s] -->> 140.82.121.4:443 (github.com) <<--

docker run --rm -it drwetter/testssl.sh@sha256:817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 --openssl /usr/bin/openssl github.com
 Done 2023-04-03 23:35:32 [  69s] -->> 140.82.121.4:443 (github.com) <<--

I've run 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e several times now and consistently get over 170s. It's the clear odd one out. However, I still get slower results than I used to on 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3, which isn't reproduced with github.com, so I guess it's me.

I'm running the tasks in Jenkins 2.387.1 as a simple Build Exec step, calling out to Docker version 23.0.2, build 569dd73. Neither Docker nor Jenkins versions have changed.

If I get time I'll try some of your other ideas. There is no difference in the output of specific site tests between the 'fast' results last week and the 'slow' results since the weekend, other than the time: a given site suddenly takes 945s instead of 294s. That's literally the only difference in the output.

mig5 commented 1 year ago

OK - a potentially useful point: the only other difference between my tests and the github.com ones above is that I'm running in a 'batch file' mode with a list of sites, plus a few extra args like --parallel --fast

CPU usage has always been very high with --parallel mode; I wonder if something changed that makes it contribute to a generalized slowdown for each of the tests.
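A cheap way to sanity-check that is to watch container CPU while a batch run is in flight, e.g. with the stock docker stats command (nothing specific to testssl.sh here):

# in a second terminal while the batch scan runs
$ docker stats --no-stream    # one-off snapshot of CPU/memory per container
$ docker stats                # or leave it streaming and watch the slow phases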

When I run against 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with just one of my sites, in the same sort of command as the github ones above, it's very fast (93s).

If I add --warnings=batch --parallel --fast --quiet --color 0 --ip one then it gets much slower: 222s. It seems to hang for ages at 'Testing robust forward secrecy (FS) -- omitting Null Authentication/Encryption, 3DES, RC4'.

If I remove --parallel, it doesn't help. If I remove --fast, it's suddenly much faster at 48s (the irony is not lost on me), and the 'Testing robust forward secrecy (FS) -- omitting Null Authentication/Encryption, 3DES, RC4' phase is basically instantaneous instead of a long hang.

If I run the same against github.com, it's not the same story though (with --fast, it is about 20s faster):

docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --warnings=batch --parallel --quiet --color 0 --ip one --openssl /usr/bin/openssl github.com

 Done 2023-04-03 23:58:06 [  73s] -->> 140.82.121.3:443 (github.com) <<--

docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --warnings=batch --parallel --fast --quiet --color 0 --ip one --openssl /usr/bin/openssl github.com

 Done 2023-04-03 23:59:20 [  59s] -->> 140.82.121.3:443 (github.com) <<--

In short: using --fast suddenly makes it much slower, but apparently only on my sites. So it's still me, I guess. Thanks anyway!

mig5 commented 1 year ago

Without --fast (48s):

 Testing server's cipher preferences 

Hexcode  Cipher Suite Name (OpenSSL)       KeyExch.   Encryption  Bits     Cipher Suite Name (IANA/RFC)
-----------------------------------------------------------------------------------------------------------------------------
SSLv2
 - 
SSLv3
 - 
TLSv1
 - 
TLSv1.1
 - 
TLSv1.2 (server order)
 xc02f   ECDHE-RSA-AES128-GCM-SHA256       ECDH 256   AESGCM      128      TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256              
 xc027   ECDHE-RSA-AES128-SHA256           ECDH 256   AES         128      TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256              
 xc013   ECDHE-RSA-AES128-SHA              ECDH 256   AES         128      TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA                 
 xc030   ECDHE-RSA-AES256-GCM-SHA384       ECDH 256   AESGCM      256      TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384              
 xc028   ECDHE-RSA-AES256-SHA384           ECDH 256   AES         256      TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA384              
 xc014   ECDHE-RSA-AES256-SHA              ECDH 256   AES         256      TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA                 
TLSv1.3
 - 

 Has server cipher order?     yes (OK)

 Testing robust forward secrecy (FS) -- omitting Null Authentication/Encryption, 3DES, RC4 

 FS is offered (OK)           ECDHE-RSA-AES256-GCM-SHA384 ECDHE-RSA-AES256-SHA384 ECDHE-RSA-AES256-SHA ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256
                              ECDHE-RSA-AES128-SHA 

Same site with --fast:

 Testing server's cipher preferences 

Hexcode  Cipher Suite Name (OpenSSL)       KeyExch.   Encryption  Bits     Cipher Suite Name (IANA/RFC)
-----------------------------------------------------------------------------------------------------------------------------
SSLv2
 - 
SSLv3
 - 
TLSv1
 - 
TLSv1.1
 - 
TLSv1.2 (server order)
TLSv1.3
 - 

 Has server cipher order?     yes (OK)

 Testing robust forward secrecy (FS) -- omitting Null Authentication/Encryption, 3DES, RC4 

 FS is offered (OK)           ECDHE-RSA-AES128-GCM-SHA256 ECDHE-RSA-AES128-SHA256 ECDHE-RSA-AES128-SHA ECDHE-RSA-AES256-GCM-SHA384 ECDHE-RSA-AES256-SHA384
                              ECDHE-RSA-AES256-SHA 

I note two things: the lack of ciphers listed under TLSv1.2 when using --fast, and the different ciphers/order chosen for the FS test. That seems to be the difference, and maybe also explains why it's affecting 'me' (or rather, the TLS settings on the various appliances whose sites I'm testing: different appliances, but all likely with the same cipher configurations).

polarathene commented 1 year ago

So uhh... my PC crashed when I almost finished responding.

You'll also notice when using --fast with --openssl that you can get errors in the output for the SSLv2/v3 detection and RC4 cipher tests, as those are not compatible with typical openssl binaries. --fast presumably makes assumptions about which openssl binary it's using, hence those errors.


Also, I remember mentioning that ca-certificates / ca-certificates-mozilla won't make a difference right now regardless. I've identified a problem installing the package with zypper --installroot during the build: a post-install script runs that tries to access /dev, but that is empty in the chroot location (/rootfs).

It'd probably work once it's copied over to the scratch image with a RUN at that point.
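If anyone wants to verify whether the trust store actually made it into a given image, something like this should show it (assuming /bin/ls exists in the final image; /var/lib/ca-certificates is the usual openSUSE location for the generated bundle):

$ docker run --rm --entrypoint /bin/ls drwetter/testssl.sh:3.1dev /etc/ssl/certs
$ docker run --rm --entrypoint /bin/ls drwetter/testssl.sh:3.1dev /var/lib/ca-certificates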

drwetter commented 1 year ago

--fast isn't documented, and it's been considered a legacy option for over 3 years with an intent to remove it AFAIK

That's almost correct. I was thinking at some point of filling this option with more useful parts of the scan. --> A matter for 3.3dev.

mig5 commented 1 year ago

Cool. Well, I just wanted to point out that with Docker image SHA256 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e, using --fast against any one of my sites is still nice and, well, fast: 112s. That's the last Docker image that was working at the expected speed.

Then both of the subsequent Docker images, 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 and 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3, with the exact same params (--warnings=batch --parallel --fast --quiet --color 0 --ip one --openssl /usr/bin/openssl), against the exact same site, with no other change in the command than the SHA, take 228s with that long 'hang' at 'Testing robust forward secrecy (FS)'. Unless I remove --fast.

So it might be something about my site(s) with these new images, but the only actual variation here is the Docker image of testssl.sh. Something has changed in testssl.sh or some other dependency within the Docker image to cause it. I'd love to know why, because I don't like mysteries, but for now I'm just gonna drop --fast and move on with my life :)

Thanks both

polarathene commented 1 year ago

it might be something about my site(s) with these new images

It doesn't reproduce as slower against the site when not using --mode parallel, right? Otherwise, can you provide that site or another one for reproduction?

Something has changed in testssl.sh or some other dependency within the Docker image to cause it. I'd love to know why, because I don't like mysteries

I don't know testssl.sh well enough, and the bash script is massive to wade through (not a time sink I want to deal with).

Other than ca-certificates not being present, I'm not sure what else might be missing, or what --fast does differently, given the new images perform better when you do a comparison without --fast.

mig5 commented 1 year ago

Here's a quick and dirty script:

#!/bin/bash

SHAs=(
6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e
817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0
363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3
)

SITE="securedrop.org"

for sha in "${SHAs[@]}"; do
  echo "===> Testing ${sha} with --fast --parallel"
  docker run --rm -it "drwetter/testssl.sh@sha256:${sha}" --warnings=batch --parallel --fast --quiet --color 0 --ip one --openssl /usr/bin/openssl "${SITE}" | grep Done
done

for sha in "${SHAs[@]}"; do
  echo "===> Testing ${sha} with --fast (no --parallel)"
  docker run --rm -it "drwetter/testssl.sh@sha256:${sha}" --warnings=batch --fast --quiet --color 0 --ip one --openssl /usr/bin/openssl "${SITE}" | grep Done
done

for sha in "${SHAs[@]}"; do
  echo "===> Testing ${sha} with --parallel (no --fast)"
  docker run --rm -it "drwetter/testssl.sh@sha256:${sha}" --warnings=batch --parallel --quiet --color 0 --ip one --openssl /usr/bin/openssl "${SITE}" | grep Done
done

for sha in "${SHAs[@]}"; do
  echo "===> Testing ${sha} with neither --parallel nor --fast"
  docker run --rm -it "drwetter/testssl.sh@sha256:${sha}" --warnings=batch --quiet --color 0 --ip one --openssl /usr/bin/openssl "${SITE}" | grep Done
done

root@mig5:~# bash test.sh
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast --parallel
 Done 2023-04-05 02:51:31 [ 101s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast --parallel
 Done 2023-04-05 02:54:22 [ 170s] -->> 104.22.17.212:443 (securedrop.org) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast --parallel
 Done 2023-04-05 02:57:11 [ 169s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast (no --parallel)
 Done 2023-04-05 02:58:50 [  98s] -->> 172.67.29.166:443 (securedrop.org) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast (no --parallel)
 Done 2023-04-05 03:01:39 [ 169s] -->> 104.22.17.212:443 (securedrop.org) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast (no --parallel)
 Done 2023-04-05 03:04:29 [ 170s] -->> 172.67.29.166:443 (securedrop.org) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --parallel (no --fast)
 Done 2023-04-05 03:06:42 [ 132s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --parallel (no --fast)
 Done 2023-04-05 03:07:58 [  76s] -->> 172.67.29.166:443 (securedrop.org) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --parallel (no --fast)
 Done 2023-04-05 03:09:14 [  76s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with neither --parallel nor --fast
 Done 2023-04-05 03:11:30 [ 136s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with neither --parallel nor --fast
 Done 2023-04-05 03:12:45 [  75s] -->> 104.22.16.212:443 (securedrop.org) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with neither --parallel nor --fast
 Done 2023-04-05 03:14:01 [  75s] -->> 172.67.29.166:443 (securedrop.org) <<--

Another site:

===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast --parallel
 Done 2023-04-05 03:19:43 [ 113s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast --parallel
 Done 2023-04-05 03:22:51 [ 187s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast --parallel
 Done 2023-04-05 03:26:01 [ 189s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast (no --parallel)
 Done 2023-04-05 03:28:07 [ 125s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast (no --parallel)
 Done 2023-04-05 03:31:16 [ 188s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast (no --parallel)
 Done 2023-04-05 03:34:24 [ 187s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --parallel (no --fast)
 Done 2023-04-05 03:36:58 [ 153s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --parallel (no --fast)
 Done 2023-04-05 03:38:28 [  90s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --parallel (no --fast)
 Done 2023-04-05 03:39:57 [  89s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with neither --parallel nor --fast
 Done 2023-04-05 03:42:17 [ 139s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with neither --parallel nor --fast
 Done 2023-04-05 03:43:47 [  89s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with neither --parallel nor --fast
 Done 2023-04-05 03:45:19 [  91s] -->> 52.212.115.228:443 (www.rcplondon.ac.uk) <<--

And a final one (my own site, which, like github.com etc., bucks the trend):

===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast --parallel
 Done 2023-04-05 03:53:26 [ 240s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast --parallel
 Done 2023-04-05 03:56:27 [ 181s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast --parallel
 Done 2023-04-05 03:59:28 [ 180s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --fast (no --parallel)
 Done 2023-04-05 04:03:33 [ 245s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --fast (no --parallel)
 Done 2023-04-05 04:06:34 [ 180s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --fast (no --parallel)
 Done 2023-04-05 04:09:35 [ 181s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with --parallel (no --fast)
 Done 2023-04-05 04:14:52 [ 316s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with --parallel (no --fast)
 Done 2023-04-05 04:19:02 [ 250s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with --parallel (no --fast)
 Done 2023-04-05 04:23:11 [ 249s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e with neither --parallel nor --fast
 Done 2023-04-05 04:28:29 [ 317s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 817767576e2026de28b496feead269e873fdb7338f3adc56502adb6494c39cd0 with neither --parallel nor --fast
 Done 2023-04-05 04:32:37 [ 247s] -->> 72.14.176.166:443 (mig5.net) <<--
===> Testing 363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 with neither --parallel nor --fast
 Done 2023-04-05 04:36:49 [ 251s] -->> 72.14.176.166:443 (mig5.net) <<--

This seems to tell us - for the sites that this does affect:

1) --parallel being present or not makes little difference for any of the SHAs here - probably because I'm testing just one site and not several.

2) When --fast is dropped, the latter two SHAs get faster, but the first (older) SHA 6569d5d3e4ab812e01590aae6a3ad23c69d45b052fcc8436c3d4af13e88cac5e actually gets slower.

To me this makes the older, March SHA more consistent with expectations (you would expect something called --fast to be faster when set and slower when unset).

I can reproduce this on a couple of sites, such as securedrop.org and www.rcplondon.ac.uk, and on a number of others, all of which vary completely in their infrastructure (some are on Google Cloud in k8s clusters, others are at AWS London in ECS autoscale clusters, others are standalone EC2 instances in AWS Ireland). Some of those I don't even have direct access to (but my client does, and I help with monitoring them), so we can rule out any configuration that 'I' would typically apply to them.

However, I indeed don't reproduce it on github.com, microsoft.com, cloudflare.com, and certain other sites that I host (such as mig5.net above). On those sites, the newer Docker SHAs are faster than the older one, and on the older one, --fast does make things faster.

It therefore seems to be site-specific, but it is always a reproducible pattern across the Docker SHAs for those sites.

polarathene commented 1 year ago

Thanks for providing more information :+1:

To me this makes the older, March SHA more consistent with expectations (you would expect something called --fast to be faster when set, and slower when unset)

@drwetter probably knows better what --fast is doing and if it's presently expected to work with any openssl binary.

One difference with Alpine would have been that it recently got OpenSSL 3.x with the Alpine 3.17 release, but prior to that it was also on OpenSSL 1.x. It also performs DNS lookups differently IIRC (which was a compatibility problem with the old openssl binary from testssl.sh if not using --openssl /usr/bin/openssl).

I don't know what you're running on your host, but you could probably try running testssl.sh there, outside of the container, where the base image is no longer relevant. If that has also regressed, then it's likely something about Alpine that was benefiting you.
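If you want to try that quickly, a plain clone run directly on the host should be enough (the clone path and target site here are illustrative; the flags mirror your batch runs):

$ git clone --depth 1 https://github.com/drwetter/testssl.sh.git
$ cd testssl.sh
$ ./testssl.sh --warnings=batch --fast --quiet --color 0 --ip one securedrop.org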


Update

I ran the tests on my end against securedrop.org and got 270s (Alpine) vs 620s (Leap). That site runs three tests where the output hangs for 2 minutes each time:

 Testing server's cipher preferences 

Hexcode  Cipher Suite Name (OpenSSL)       KeyExch.   Encryption  Bits     Cipher Suite Name (IANA/RFC)
-----------------------------------------------------------------------------------------------------------------------------
SSLv2
 - 
SSLv3
 - 
TLSv1
 - 
TLSv1.1
 - 
TLSv1.2 (server order)
TLSv1.3
 - 

 Has server cipher order?     yes (OK)
 -- inconclusive test, matching cipher in list missing, better see above

 Testing robust forward secrecy (FS) -- omitting Null Authentication/Encryption, 3DES, RC4 

I'm not sure if --fast is compatible with the openSUSE OpenSSL packages, but using the openssl build testssl.sh provides by default magically fixes the problem :sweat_smile:

Fixes:

drwetter commented 1 year ago

1. please don't use --fast anymore. If it's broken, we won't fix it.

2. --parallel is for multiple hosts only. If you're using it for anything other than that and encounter strange performance issues, I can't help it.

3. The local certificates which come with an image or an OS aren't/shouldn't be used, even if they were available. We have our own certificate stores.

polarathene commented 1 year ago

1. please don't use --fast anymore. If it's broken, we won't fix it.

2. --parallel is for multiple hosts only. If you're using it for other than that and encounter strange performance issues, I can't help it.

You don't need --parallel to reproduce.

polarathene commented 1 year ago

--ssl-native also triggers a reproduction. I'm pretty sure the cause is around here:

10403:     "$SSL_NATIVE" && using_sockets=false
10404-     "$FAST" && using_sockets=false
10405-     [[ $TLS_NR_CIPHERS == 0 ]] && using_sockets=false
10406-
10407-     outln
10408-     pr_headline " Testing robust forward secrecy (FS)"; prln_underline " -- omitting Null Authentication/Encryption, 3DES, RC4 "
10409-     if ! "$using_sockets"; then
10410:          [[ $TLS_NR_CIPHERS == 0 ]] && ! "$SSL_NATIVE" && ! "$FAST" && pr_warning " Cipher mapping not available, doing a fallback to openssl"
10411-          if ! "$HAS_DH_BITS" && "$WIDE"; then
10412:               [[ $TLS_NR_CIPHERS == 0 ]] && ! "$SSL_NATIVE" && ! "$FAST" && out "."
10413-               pr_warning "    (Your $OPENSSL cannot show DH/ECDH bits)"
10414-          fi
10415-          outln
10416-     fi
10470-     else
10471-          debugme echo $nr_supported_ciphers
10472-          debugme echo $(actually_supported_osslciphers $fs_cipher_list "ALL")
10473-          if [[ "$nr_supported_ciphers" -le "$CLIENT_MIN_FS" ]]; then
10474-               outln
10475-               prln_local_problem "You only have $nr_supported_ciphers FS ciphers on the client side "
10476-               fileout "$jsonID" "WARN" "tests skipped as you only have $nr_supported_ciphers FS ciphers on the client site. ($CLIENT_MIN_FS are required)"
10477-               return 1
10478-          fi
10479-          # By default, OpenSSL 1.1.1 and above only include a few curves in the ClientHello, so in order
10480-          # to test all curves, the -curves option must be added. In addition, OpenSSL limits the number of
10481-          # curves that can be specified to 28. So, if more than 28 curves are supported, then the curves must
10482-          # be tested in batches.
10483-          curves_list1="$(strip_trailing_space "$(strip_leading_space "$OSSL_SUPPORTED_CURVES")")"
10484-          curves_list1="${curves_list1//  / }"
10485-          if [[ "$(count_words "$OSSL_SUPPORTED_CURVES")" -gt 28 ]]; then
10486-               # Place the first 28 supported curves in curves_list1 and the remainder in curves_list2.
10487-               curves_list2="${curves_list1#* * * * * * * * * * * * * * * * * * * * * * * * * * * * }"
10488-               curves_list1="${curves_list1%$curves_list2}"
10489-               curves_list1="$(strip_trailing_space "$curves_list1")"
10490-               curves_list2="${curves_list2// /:}"
10491-          fi
10492-          curves_list1="${curves_list1// /:}"
10493-          $OPENSSL s_client $(s_client_options "-cipher $fs_cipher_list -ciphersuites ALL $STARTTLS $BUGS -connect $NODEIP:$PORT $PROXY $SNI") >$TMPFILE 2>$ERRFILE </dev/null
10494-          sclient_connect_successful $? $TMPFILE
10495-          sclient_success=$?
10496-          [[ $sclient_success -eq 0 ]] && [[ $(grep -ac "BEGIN CERTIFICATE" $TMPFILE) -eq 0 ]] && sclient_success=1
10497-          # Sometimes a TLS 1.3 ClientHello will fail, but a TLS 1.2 ClientHello will succeed. See #2131.
10498-          if [[ $sclient_success -ne 0 ]]; then
10499-               curves_option="-curves $curves_list1"
10500-               $OPENSSL s_client $(s_client_options "-cipher $fs_cipher_list $curves_option $STARTTLS $BUGS -connect $NODEIP:$PORT $PROXY $SNI") >$TMPFILE 2>$ERRFILE </dev/null
10501-               sclient_connect_successful $? $TMPFILE
10502-               sclient_success=$?
10503-               [[ $sclient_success -eq 0 ]] && [[ $(grep -ac "BEGIN CERTIFICATE" $TMPFILE) -eq 0 ]] && sclient_success=1
10504-               if [[ $sclient_success -ne 0 ]] && [[ -n "$curves_list2" ]]; then
10505-                    curves_option="-curves $curves_list2"
10506-                    $OPENSSL s_client $(s_client_options "-cipher $fs_cipher_list $curves_option $STARTTLS $BUGS -connect $NODEIP:$PORT $PROXY $SNI") >$TMPFILE 2>$ERRFILE </dev/null
10507-                    sclient_connect_successful $? $TMPFILE
10508-                    sclient_success=$?
10509-                    [[ $sclient_success -eq 0 ]] && [[ $(grep -ac "BEGIN CERTIFICATE" $TMPFILE) -eq 0 ]] && sclient_success=1
10510-               fi
10511-          fi
10512-     fi

That last conditional appears to be triggered because the host securedrop.org doesn't look like it supports TLS 1.3 (why?), combined with not using sockets, which both --fast and --ssl-native opt out of. Hence why some other sites like github.com don't regress with these options.

But as mentioned, this is not a bug in testssl.sh AFAIK, just something wrong with how OpenSSL is built / configured in openSUSE Leap.

@mig5 if your tested website offers TLS 1.3 it'd avoid that branch and should work with --fast like you expect.
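A quick way to check whether a host negotiates TLS 1.3 at all, assuming a reasonably current openssl (1.1.1+) on your machine; the host here is just the one from the earlier tests:

$ openssl s_client -tls1_3 -connect securedrop.org:443 -servername securedrop.org </dev/null
# a successful handshake prints a TLSv1.3 protocol/cipher line; an immediate error means no TLS 1.3 on that endpoint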

Alternatively, run without --openssl, which shouldn't be necessary now AFAIK; in the past with Alpine it provided better performance and avoided some DNS lookup related bugs, due to the bundled testssl.sh openssl binary being intended for glibc, not musl.
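For reference, that just means dropping --openssl from the earlier command, e.g. (digest copied from above, site illustrative):

$ docker run --rm -it drwetter/testssl.sh@sha256:363d162b04a483826bb91c2e04c3498d16d60b3a953fd599b3cb0e8dc9076eb3 --warnings=batch --fast --quiet --color 0 --ip one securedrop.org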

I got 200s, which is faster than what you'd have had on Alpine before (based on my prior timings; yours will be lower than this).

mig5 commented 1 year ago

@mig5 if your tested website offers TLS 1.3 it'd avoid that branch and should work with --fast like you expect.

Indeed, I wondered if this might be the thing. My own website (mig5.net) supports TLS 1.3, and of course github.com etc. do too. The sites where I first noticed this are using ALBs at Amazon that were on an older set of ciphersuites etc. (TLS 1.3 only became possible on ALBs last month!). securedrop.org is not one of those, but it's probably a similar story (I'm slightly surprised, as it's using Cloudflare, but it's probably a setting somewhere, yep).

Thanks for all that debugging. For now I've dropped --fast and things are of course great, but I'm going to look in more detail at dropping --openssl, and also of course at adding TLS 1.3 wherever I have the power to!