Consider a more aggressive SLA

alex commented 7 years ago

👋 I think current discussions accurately capture that the industry standard (as seen in places like Project Zero, but which some vendors sadly still miss) is around 90 days as an SLA for remediation.

I'd like to introduce a new idea: while that may be the standard, it's not as good as we should be doing. My theory for this is informed by research such as "Security impact ratings considered harmful", which concludes that our ability to assess the potential impact of a vulnerability is imperfect, and that we tend to err towards underestimating it. Similarly, our own monitoring is imperfect, and we may not be aware that a vulnerability is in active use. Therefore I conclude that the industry standard of 90 days is probably far too high.

As this program represents a fresh opportunity to reconsider, I'd like to encourage 18F/TTS to adopt a more aggressive SLA. Even something as small as choosing 85 days instead of 90 days sends the clear message that the SLA is a starting point, and something to be continually improved on and driven downwards.

jacobian commented 7 years ago

Is anyone aware of any research into the effectiveness of various patch management timelines? Verizon's 2017 DBIR has some data that shows that most organizations patch the majority of their issues (94%) within 12 weeks (84 days), but that doesn't say anything about whether that's effective, just that most organizations do so (and we all know that the DBIR has had some dodgy data issues in the past).

I'd love to make a data-driven decision here, but I can't seem to find any data :(

jacobian commented 7 years ago

FWIW, CERT has a 45-day disclosure timeline.

jacobian commented 7 years ago

So here's a fairly different idea: what if instead of a worst-case "deadline" SLA, we instead had a percentile-based one? That is, instead of "all issues resolved within 90 days", we had "p95 resolution time of 60 days" (or whatever)?

What's nice about this is that it makes outliers less of a big deal, while also letting us push more aggressively on the common case. The problem with a deadline SLA is that it needs to take worst cases into account, but I'm frankly a lot more interested in the typical case. My experience is that the vast majority of issues can be resolved pretty quickly, but the weird ones make tighter deadlines less tenable.
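
To make the difference concrete, here's a rough sketch in Python. The resolution times are made up for illustration, and `percentile` and `meets_percentile_sla` are hypothetical helper names, not part of any existing tool. It uses the nearest-rank definition of a percentile, which is only one of several reasonable choices.

```python
def percentile(values, pct):
    """Nearest-rank percentile: the smallest value such that at least
    pct% of the data is at or below it. pct is an integer from 1 to 100."""
    if not values:
        raise ValueError("need at least one resolution time")
    ordered = sorted(values)
    # Integer ceiling of pct * n / 100, avoiding floating-point rounding.
    rank = -(-pct * len(ordered) // 100)
    return ordered[max(rank, 1) - 1]


def meets_percentile_sla(resolution_days, pct=95, target_days=60):
    """True if the pct-th percentile resolution time is within target_days."""
    return percentile(resolution_days, pct) <= target_days


# Hypothetical data: days to resolve 20 issues, including one weird outlier.
resolution_days = [3, 5, 7, 2, 10, 14, 4, 6, 8, 9,
                   12, 1, 5, 20, 30, 11, 3, 7, 15, 200]

p95 = percentile(resolution_days, 95)
worst = max(resolution_days)

print(f"p95 resolution time: {p95} days")
print(f"worst case: {worst} days")
print(f"meets 'p95 within 60 days': {meets_percentile_sla(resolution_days)}")
print(f"meets 'all within 90 days': {worst <= 90}")
```

With this made-up data, the one 200-day outlier breaks an "all issues within 90 days" deadline, while a "p95 within 60 days" target is still met, which is the trade-off described above.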

@commit-dkp @kimberbat thoughts?

alex commented 7 years ago

I think as an internal measure, p95 (or some other percentile) is a great way of tracking success, probably more so than a fixed SLA. However, I'm not sure it addresses the problem I was thinking about.

When I wrote up the issue I was thinking about the SLA as the internal counterpart for the disclosure timeline requested/imposed on reporters. That is, the SLA is the time to beat so that we don't agree that public disclosure by the reporter is legitimate. For the purposes of keeping communication straightforward, I think that relationship ("how long does the reporter have to keep this private") probably does need a fixed number. And that's the number that I'm particularly interested in finding ways to push downwards.

commit-dkp commented 5 years ago

The TTS vulnerability disclosure policy at https://github.com/18F/vulnerability-disclosure-policy/blob/master/vulnerability-disclosure-policy.md#coordinated-disclosure explicitly acknowledges that reporters are under no obligation with TTS to keep anything private. It invites coordinated disclosure, but does not require it.

afeld commented 4 years ago

Closing as stale.

afeld commented 4 years ago

But thanks so much for raising it!

18F / bug-bounty

Consider a more aggressive SLA #24