The four levels of severity here correspond to an ask to add fatal/noncontinuable/critical to the current (business-process-level) `level` property. So that consistency is good.
I have two questions: 1) Could we render this property simply as `severity`? Does potential severity in a reliability setting, for example, not warrant its own setting? 2) Alternatively, should we consider simply burning CVSS into SARIF as a standard property (the application of which strictly to security is obviously implied)? I believe your numeric `security-severity` is a CVSS value?
Overall, I think we can have more discussion here.
We've talked primarily about `level`, but there is also `kind` 'above' `level` and `rank` 'below' `level`.
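For concreteness, here is a minimal sketch of a single result carrying all three properties; the rule id, rank, and message are made up for illustration and nothing here is normative.

```python
import json

# Illustrative SARIF 2.1.0 result fragment showing the three existing knobs:
# `kind` sits "above" `level`, `rank` sits "below" it. Values are made up.
result = {
    "ruleId": "example/hypothetical-rule",
    "kind": "fail",                           # the nature of the outcome
    "level": "warning",                       # business-process severity
    "rank": 62.5,                             # relative priority, 0.0-100.0
    "message": {"text": "Example finding."},
}

print(json.dumps(result, indent=2))
```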
+1 to Michael's comments about other domains, like reliability and privacy, that also use static analysis. Having a security-specific item doesn't, to me, help align the larger ecosystem when thinking about this. The conversations about CVSS being a bit polarizing highlight this, I think.
For me, this determination really lives outside of the result schema today. Here are a few examples that come to mind.
Ruleset files for VS are used for security, quality, and even just style. There are default values for `level` that match what the tool produces in SARIF, but the local ruleset file is used by clients to determine how this particular codebase or organization interprets each individual rule.
https://learn.microsoft.com/en-us/visualstudio/code-quality/using-rule-sets-to-group-code-analysis-rules?view=vs-2022
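To make that concrete, here's a purely hypothetical sketch of the idea (nothing here is the actual .ruleset format): the SARIF output carries the tool's default `level` per rule, while a separate, locally owned policy decides what this codebase or organization does with each rule.

```python
# Hypothetical illustration only; not the actual VS ruleset format or API.
# The SARIF file carries the tool's defaults; a locally owned policy overrides
# them for this codebase or organization.
tool_defaults = {"CA2100": "warning", "IDE0055": "note"}
local_policy = {"CA2100": "error", "IDE0055": "none"}  # org-specific overrides

def effective_level(rule_id: str) -> str:
    """Local policy wins over the tool's default when present."""
    return local_policy.get(rule_id, tool_defaults.get(rule_id, "warning"))

assert effective_level("CA2100") == "error"
```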
Thinking even about CodeQL within Microsoft, we have the idea of sdl-required and sdl-recommended groups of issues. Even with these, it is often up to the individual organizations and business owners to determine which set will result in folks showing up as red in S360.
Overall, my initial inkling is to perhaps add `critical` to `level`, although this doesn't lend itself to some of the default rendering we have in IDEs today. I'd love @michaelcfanning to speak to precedent on using other artifacts outside of the SARIF results to determine this interpretation (e.g. the ruleset files linked above: https://learn.microsoft.com/en-us/visualstudio/code-quality/using-rule-sets-to-group-code-analysis-rules?view=vs-2022).
Giving a detailed reliability example, in case it helps:
My team surfaces crashes in SARIF. There are a few ways that we see dev teams and analytics determine severity.
What do we want SARIF and SARIF clients to be able to do to determine and communicate severity? Having an `error`-level crash versus a `warning`-level crash seems a bit off compared to a `medium`-severity versus a `high`-severity crash. I'm not sure how we imagine these two working in tandem.
Good example. Our first principle for `level` is that it communicates a sort of business-process severity: `error` breaks the build, `warning` does not, etc. The separately proposed severity is an attempt to measure something that's intrinsic to the problem (assuming it isn't a false positive!). Coupled with something like precision, you might end up with the following sort of thing: we treat this class of problem as a `note` because the precision is low and the severity is medium; we treat this other class of finding as an `error` because the severity is high and the precision is medium. Etc.
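As a sketch of those two cases, assuming the proposed `precision` and `severity` properties (shown here in the rule's property bag purely for illustration; their exact placement isn't settled by this thread):

```python
# Sketch of the two cases described above. `defaultConfiguration.level` carries
# the resulting business-process decision; `precision` and `severity` are the
# proposed intrinsic measures. Rule ids are hypothetical.
rules = [
    {
        "id": "example/noisy-check",
        "defaultConfiguration": {"level": "note"},      # treated as a note...
        "properties": {"precision": "low", "severity": "medium"},
    },
    {
        "id": "example/dangerous-pattern",
        "defaultConfiguration": {"level": "error"},     # treated as an error...
        "properties": {"precision": "medium", "severity": "high"},
    },
]
```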
@adityasharad, my concrete proposal is that we approve `precision` and `severity` (or `security-severity` if you prefer) and follow the model of `rank`, i.e., a value of 0 to 100.0 inclusive. Now all these things are consistent. We additionally update the boilerplate language for all three of these properties to note that consumers can sort across these knobs as they are expressed. But consumers can also bucket any/all of these properties into low (0-25), medium (26-50), high (51-75), or critical (76-100).
'Very high' may be preferable to 'critical' as a suggested label.
These specific details, i.e., precise information on how to bucketize findings, will address concerns you raised on the call, I hope. @stacywray
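A minimal sketch of that bucketing, using the ranges above; how fractional boundary values are handled is my own assumption.

```python
def bucket(score: float) -> str:
    """Bucket a 0.0-100.0 SARIF-style value (`rank`, `precision`, or the
    proposed `severity`) using the suggested ranges. 'critical' could just
    as well be labelled 'very high'."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0.0 and 100.0 inclusive")
    if score <= 25:
        return "low"
    if score <= 50:
        return "medium"
    if score <= 75:
        return "high"
    return "critical"

assert bucket(62.5) == "high"
```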
Originally filed as https://github.com/oasis-tcs/sarif-spec/issues/598, split into more focused component issues.
GitHub Advanced Security's code scanning feature recognises `security-severity` levels, currently read from the `properties` bag of a SARIF `reportingDescriptor` or `result` object, and renders these in the UI. GitHub CodeQL populates this property in its SARIF output, and the property is recognised for other code scanning tools.
Docs: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#reportingdescriptor-object and https://github.blog/changelog/2021-07-19-codeql-code-scanning-new-severity-levels-for-security-alerts/
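For illustration, roughly how this looks today per the linked docs, with the score carried in the rule's property bag. The rule id and value are illustrative, and my understanding that the score is serialized as a string is an assumption.

```python
# Roughly the current shape: a numeric score in the reportingDescriptor's
# property bag, translated by the backend into an enum bucket for display.
rule_today = {
    "id": "js/sql-injection",          # illustrative rule id
    "properties": {
        "security-severity": "8.8",    # illustrative score
    },
}
```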
Using the properties bag was a pragmatic measure to provide this functionality without requiring a SARIF spec change. For SARIF 2.2 I propose we make `security-severity` an accepted property directly on a `reportingDescriptor` or a `result`.
Although GitHub currently accepts numeric values in this property, and translates them on the backend into enum values, we recommend moving away from this for the spec and instead directly accepting an enum of possible values: `critical`, `high`, `medium`, `low`, or omitted entirely (e.g. for non-security rules/results). SARIF producer tools can supply this property to indicate the tool's own assessment of the security severity of the result or, for a rule, the security severity of the results reported by this rule. This avoids constraining tool developers into using the same translation methodology that we follow for CodeQL (which is based on CVSS scores, which not all producers may wish to follow). We can update CodeQL to begin populating these enum values using the new property instead of the numeric values it emits today.
cc @michaelcfanning
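For illustration, a sketch of what the proposed shape could look like; this reflects the proposal above, not the current spec, and the rule id is just an example.

```python
# Proposed SARIF 2.2 shape: `security-severity` as a first-class property on a
# reportingDescriptor (or result), taking an enum value directly.
rule_proposed = {
    "id": "js/sql-injection",       # illustrative rule id
    "security-severity": "high",    # one of: critical, high, medium, low; omit for non-security rules
}
```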