18F / pulse

How the federal .gov domain space is doing at best practices and policies.

Cut SSL Labs scans? #652

Closed -- konklone closed this issue 6 years ago

konklone commented 7 years ago

They take forever, the results are not needed for compliance, and (as a matter of practicality, since they take so long), they can only be done on the parent domains.

The breakdown of the scans, now that subdomains are added, is roughly like this:

I've bolded the two sections of the scan above that dominate the runtime. Together these currently add up to over 56 hours, meaning that, since we currently scan every 2 or 3 days, we probably sometimes have two concurrent scan processes running (which could cause race conditions).

The 17.5 hours we spend running pshtt and sslyze are well worth the time, as they give us extensive subdomain coverage.

For reference, we don't use the sslyze part in the Pulse dashboard, but GSA staff does use that data in offline analysis to support HTTPS and PKI policy implementation and general understanding of the .gov space. (I haven't broken out how long pshtt takes vs sslyze, but I think they each make up roughly half.)

However, GSA staff never use the SSL Labs data in any analysis, especially since all the relevant data is already contained in the sslyze data we collect. The grade is used mostly by government staff outside of the Pulse team. In interactions with agency staff, there has been confusion over whether the grade is relevant to their compliance status, and over whether it applies to subdomains of the parent domain.

There are upsides, too -- it's clear that the grade has motivated agency staff to make technical improvements to their TLS posture, which is a major positive. So, we might consider replacing the grade with our own measure of issues in agency TLS posture -- probably not a grade, but perhaps flagging major issues (such as SHA-1 usage or lack of TLS 1.2 support).
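For illustration only, such flagging could be a simple pass over each domain's scan results. The field names below are hypothetical placeholders, not the actual pshtt/sslyze output schema:

```python
# Hypothetical sketch: deriving "major issue" flags from per-domain TLS scan
# results. The input field names are assumptions for illustration, not the
# real pshtt/sslyze output schema.

def tls_issue_flags(scan):
    """Return a list of human-readable TLS posture issues for one domain."""
    issues = []

    # Flag certificate chains still using SHA-1 signatures.
    if scan.get("sha1_in_chain"):
        issues.append("SHA-1 signature in certificate chain")

    # Flag servers that do not offer TLS 1.2.
    if not scan.get("supports_tls_1_2"):
        issues.append("No TLS 1.2 support")

    # Flag legacy protocol versions still enabled.
    if scan.get("supports_ssl_3") or scan.get("supports_tls_1_0"):
        issues.append("SSLv3 or TLS 1.0 still enabled")

    return issues


# Example: a domain still using SHA-1 and lacking TLS 1.2 gets two flags.
print(tls_issue_flags({"sha1_in_chain": True, "supports_tls_1_2": False}))
```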

However, that kind of GSA-driven evaluation would mean we'd need to keep our measures up to date as TLS best practices evolve. By using SSL Labs, we get a grading process that keeps up with modern best practices essentially for free.

That all said, the grade is only driving improvements in parent domains -- which are a small fraction of the overall ecosystem -- not in subdomains. That said, raising awareness of SSL Labs grading among agencies is a good thing in itself, and could indirectly drive improvements in subdomains even though we don't measure them.

It's a tough question, and I don't know the right answer. However, I want to raise that the SSL Labs grades continue to play a disproportionate role in our scanning overhead, and they do create some confusion among agencies, even as they do generate some improvements.

cc @h-m-f-t @garretr @alex @smarina04

konklone commented 7 years ago

cc @lachellel @grandamp as well

h-m-f-t commented 7 years ago

@konklone What if the SSL Labs score for parent domains were not part of your mainline scanning process? A score could be cached and a site rescanned at a rate that depends on its grade:

Or something. The assumptions behind this method are that well-configured sites will generally stay well-configured, and that lower-graded sites should be rescanned more frequently, given that they are being publicly shamed for their poor performance. These may be bad assumptions.
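As a rough sketch of what that could look like (the grade buckets and intervals below are purely hypothetical, not a proposed policy):

```python
# Hypothetical sketch of grade-dependent rescan scheduling. The intervals and
# grade buckets are illustrative placeholders only.
from datetime import datetime, timedelta

RESCAN_INTERVALS = {
    "A+": timedelta(days=30),
    "A":  timedelta(days=30),
    "B":  timedelta(days=14),
    "C":  timedelta(days=7),
    "F":  timedelta(days=2),   # poorly graded sites get rescanned most often
}
DEFAULT_INTERVAL = timedelta(days=7)

def needs_rescan(last_scanned, grade, now=None):
    """True if a domain's cached SSL Labs grade is stale and should be refreshed."""
    now = now or datetime.utcnow()
    interval = RESCAN_INTERVALS.get(grade, DEFAULT_INTERVAL)
    return now - last_scanned >= interval


# Example: an "F"-graded site last scanned three days ago is due for a rescan.
print(needs_rescan(datetime.utcnow() - timedelta(days=3), "F"))
```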

This approach also adds complexity and might not even save time. There are a lot of blank scores present -- though these look to be mostly non-HTTPS sites whose SSL Labs scans probably finish fairly quickly.

konklone commented 7 years ago

> Or something. The assumptions behind this method are that well-configured sites will generally stay well-configured, and that lower-graded sites should be rescanned more frequently, given that they are being publicly shamed for their poor performance. These may be bad assumptions.

That's a neat idea, and maybe there's something viable along those lines -- though as you say, fairly complex. It also would mean the grades wouldn't actually be associated with the main scan time we advertise for the rest of the data at the top of the page.

I would probably be more comfortable just caching all SSL Labs grades for longer and only doing them once a week, though even that would take some work, and would have the same issue described above.

> There are a lot of blank scores present -- though these look to be mostly non-HTTPS sites whose SSL Labs scans probably finish fairly quickly.

The blank scores are because, in the most recent scan, I nullified the SSL Labs scans without modifying the code or system (I pointed api.ssllabs.com at 127.0.0.1 via /etc/hosts) so that the process would complete more quickly while still exercising the full production code path. That won't happen going forward.
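Concretely, an /etc/hosts override along these lines would have that effect (the exact entry is an assumption based on the description above):

```
# Redirect the SSL Labs API hostname to loopback so scans fail fast
127.0.0.1   api.ssllabs.com
```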

konklone commented 6 years ago

These are removed.