ossf / scorecard-webapp

Website and API for OpenSSF Scorecard
https://scorecard.dev
Apache License 2.0

"Risk level" highly misleading (Dangerous-Workflow always "CRITICAL"). Scorecard report ("webviewer") seems broken #622

Open Chealer opened 5 months ago

Chealer commented 5 months ago

OpenSSF Scorecard reports, such as Linux's, contain a number of checks, each of which has a name, a score and what is called a "risk level". For example, Linux's report has a check with the name Dangerous-Workflow, the score 10 and the "Risk level" "CRITICAL".

While the label "CRITICAL" seems to indicate a grave issue, that is not (necessarily) the case. It took me about 2 minutes to figure out what this means...

In fact, the score and the risk level are not proportional. It is not a defect that a given check has both a 10/10 score and a "CRITICAL" tag (although it is a design bug). All products have the "Risk level" "CRITICAL" for the Dangerous-Workflow check. It turns out that "Risk level" is a basic Scorecard concept, introduced in the homepage's How it works section. This is the same confusion as the one reported on 2023-09-12 by @evverx in ticket ossf/scorecard#2979 (he calls it "severity").

Representing risk is non-trivial and, being new to Scorecard, I am not in a great position to advise, but the property currently named "Risk level" should certainly be renamed. My understanding is that if (1) a check's assessment is correct and (2) the check assesses a degree of risk, then the checked aspect can represent a LOW/MEDIUM/HIGH/CRITICAL level of risk. I'd suggest something like "Importance of check", for want of a more functional name.

In line with that, the values should also be relabelled. Since they do not represent actual risk, they should just describe importance, for example:

  1. low
  2. medium
  3. high
  4. highest

Correspondingly, the visual representation should be adjusted (perhaps using size rather than colors to distinguish). Putting that property in its own column would also greatly help readers understand what it means.

spencerschrock commented 4 months ago

Moved this over to the relevant repo, as the report concerns the web viewer. While the risk levels are a Scorecard concept, they're not present in the other output formats.

This issue came up in our bi-weekly dev sync today, so others certainly share your opinion. In terms of how Scorecard uses the value when calculating scores, I think "importance" is a good word. I think the existing values (low, medium, high, critical) are fine for describing importance as well.
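To make the "importance" reading concrete, here is a minimal sketch of a risk-weighted aggregate score. The weights (10, 7.5, 5, 2.5) are an assumption recalled from the Scorecard README rather than something stated in this thread, and the sample checks other than Dangerous-Workflow are hypothetical. The point is only that the level acts as a fixed weight attached to the check, not as a finding about the project.

```python
# Assumed weights per risk level (illustrative; not confirmed in this thread).
RISK_WEIGHTS = {"Critical": 10.0, "High": 7.5, "Medium": 5.0, "Low": 2.5}


def aggregate_score(checks):
    """Return the weighted mean of per-check scores.

    `checks` is a list of (name, score, risk_level) tuples, score in 0..10.
    """
    total = 0.0
    weight_sum = 0.0
    for _name, score, risk in checks:
        weight = RISK_WEIGHTS[risk]  # the level is a property of the check itself
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else None


# Hypothetical report: a perfect Dangerous-Workflow score still carries
# the "Critical" label, because the label only sets the weight.
checks = [
    ("Dangerous-Workflow", 10, "Critical"),
    ("Code-Review", 0, "High"),
    ("License", 10, "Low"),
]
result = aggregate_score(checks)
```

With these sample values, the 10/10 "Critical" check pulls the aggregate up, while the 0/10 "High" check drags it down more than a 0/10 "Low" check would.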

Would changing Risk level -> Importance level in the webviewer be sufficient for avoiding the confusion?

evverx commented 4 months ago

> Would changing Risk level -> Importance level in the webviewer be sufficient for avoiding the confusion?

I'm not sure it would. "10 dangerous-workflow critical" is still confusing in the sense that it isn't clear whether it's good or bad. It isn't clear whether it should be acted on either. As far as I understand, the idea is to redesign things a bit.

("importance" sounds better but I'm not a native English speaker so it's not that different to me. Though it's certainly better than "severity" :-))

(To me it was OK at the time because those dashboards replaced machine-readable dumps like https://api.securityscorecards.dev/projects/github.com/systemd/systemd and made it all more or less human-friendly and then I probably got used to it)

lelia commented 4 months ago

From a UX perspective, I think the fact that "Dangerous Workflow" is currently the only check classed as "Critical", combined with the fact that the default webapp sort view (Risk level: descending) places it at the very top, makes for a very confusing experience for new Scorecard users:

[Image: screenshot, 2024-05-15 3:11:20 PM]

I do see that one additional "Critical" check for "Webhooks" is planned (noted as experimental in the README), so perhaps this experience will become less misleading once more "Critical" checks populate the webapp view, but I'm not sure what the timeline is on this.

Consider the Gutenberg principle, which roughly maps out average human eye movement patterns when viewing a page:

[Image: reading_gravity_01]

If "Dangerous Workflow" is going to remain the sole "Critical" check for the foreseeable future, perhaps it makes sense to swap the locations of the "Risks" levels and the numerical scores, so that Risks sit to the left of the check name (the "weak fallow" area) and the more actionable score numbers migrate towards the far right (the "strong fallow/terminal" areas).
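As a rough illustration of the two points above, the sketch below reproduces the default ordering (risk level, descending) and then renders each row with the risk level on the left and the score on the far right. The row data, the `RISK_ORDER` ranking and both function names are hypothetical and are not taken from the webapp's code.

```python
# Hypothetical ranking used only to sort the levels.
RISK_ORDER = {"Low": 0, "Medium": 1, "High": 2, "Critical": 3}

# Hypothetical report rows: (check name, score out of 10, risk level).
rows = [
    ("License", 10, "Low"),
    ("Code-Review", 3, "High"),
    ("Dangerous-Workflow", 10, "Critical"),
    ("Fuzzing", 0, "Medium"),
]


def default_view(rows):
    """Sort by risk level, descending, as the default view is described above."""
    return sorted(rows, key=lambda r: RISK_ORDER[r[2]], reverse=True)


def proposed_line(row):
    """Render a row as: risk level, check name, then the score at the far right."""
    name, score, risk = row
    return f"{risk:<9} {name:<20} {score:>2}/10"


view = default_view(rows)
lines = [proposed_line(r) for r in view]
print("\n".join(lines))
```

Even with a perfect score, Dangerous-Workflow lands in the first row under this sort, which is the effect described above. Moving the score to the far right only changes where the eye ends up on each row, not the row order.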