oasis-tcs / sarif-spec

OASIS SARIF TC: Repository for development of the draft standard, where requests for modification should be made via Github Issues
https://github.com/oasis-tcs/sarif-spec
Other
166 stars 47 forks source link

SARIF 2.2 proposal: `security-severity` field for reportingDescriptors and results #612

Open adityasharad opened 11 months ago

adityasharad commented 11 months ago

Originally filed as https://github.com/oasis-tcs/sarif-spec/issues/598, split into more focused component issues.

GitHub Advanced Security's code scanning feature recognises security-severity levels, currently read from the properties bag of a SARIF reportingDescriptor or result object, and renders these in the UI. GitHub CodeQL populates this property in its SARIF output, and the property is recognised for other code scanning tools.

Docs: https://docs.github.com/en/code-security/code-scanning/integrating-with-code-scanning/sarif-support-for-code-scanning#reportingdescriptor-object and https://github.blog/changelog/2021-07-19-codeql-code-scanning-new-severity-levels-for-security-alerts/

Using the properties bag was a pragmatic measure to provide this functionality without requiring a SARIF spec change. For SARIF 2.2 I propose we make security-severity an accepted property directly on a reportingDescriptor or a result.

Although GitHub currently accepts numeric values in this property, and translates them on the backend into enum values, we recommend moving away from this for the spec, and instead directly accept an enum of possible values: critical, high, medium, low, or omitted entirely (e.g. for non-security rules/results). SARIF producer tools can supply this property to indicate the tool's own confidence in the security severity of the result, or for a rule, the security severity of the results indicated by this rule. This avoids constraining tool developers into using the same translation methodology that we follow for CodeQL (which is based on CVSS scores, which all producers may not wish to follow). We can update CodeQL to begin populating these enum values using the new property instead of the numeric values it does today.

cc @michaelcfanning

michaelcfanning commented 10 months ago

The four levels of severity here correspond to an ask to add fatal/noncontiuable/critical to the current (business process level) level property. So that consistency is good.

I have two questions: 1) could we render this property simply as severity? does potential severity in a reliability setting, for example, not warrant its own setting? 2) alternately, should we consider simply burning cvss into SARIF as a standard property (the application of which strictly to security is obviously implied). I believe your numeric security-severity is a CVSS value?

ShiningMassXAcc commented 10 months ago

Overall, I think we can have more discussion here.

Ruleset files for VS that are used for both security, quality, and even just style. There are default values for levelthat match to SARIF for what the tool produces, but the local ruleset file is used by clients to determine how this particular codebase or organization interprets that individual rule. https://learn.microsoft.com/en-us/visualstudio/code-quality/using-rule-sets-to-group-code-analysis-rules?view=vs-2022

Thinking even about codeql within Microsoft, we have the idea of sdl-required and sdl-recommended groups of issues. Even with these, it is often up to the individual organizations and business owners to determine which set will result in folks showing up as red in S360.

Overall, my initial inkling is to perhaps add Criticalto Level, although this doesn't lend itself to some of the default rendering, we have in IDEs today. I'd love @michaelcfanning to talk to precedent on using other artifacts outside of the sarif results to determine this interpretation (eg. like https://learn.microsoft.com/en-us/visualstudio/code-quality/using-rule-sets-to-group-code-analysis-rules?view=vs-2022 above)

ShiningMassXAcc commented 10 months ago

Giving a detailed reliability example - if this helps:

My team surfaces crashes in SARIF. There are a few ways that we see dev teams and analytics determine severity.

What do we want SARIF and SARIF clients to be able to do to determine and communicate severity? Having an error level crash versus a warning level crash - seems a bit off compared to medium severity versus high severity crash. Not sure how we imagine these two working in tandem?

michaelcfanning commented 10 months ago

Good example. Our first principle for level is that it is communicates a sort of business process severity. Error breaks the build. Warning does not, etc. The separate proposed severity is an attempt to measure a level that's intrinsic to the problem (assuming it isn't a true positive!). Coupled with something like precision, you might end up with the following sort of thing: We treat this class of problem as a Note because the precision is low and the severity is medium. We treat this other class of finding as Error because the severity is high and the precision is medium. Etc.

@adityasharad, my concrete proposal is that we approve precision and severity (or security-severity if you prefer, and follow the model of rank, i.e., a value of 0 to 100.0 inclusive. Now all these things are consistent. We additional update the boilerplate language for all three of these properties to note that consumers can sort across these knobs as they are expressed. But consumers can also bucket any/all of these properties into low (0-25), medium (26 - 50), high (51 - 75) or critical.

'Very high' may be preferable to 'critical' as a suggested label.

These specific details, i.e., precise information on how to bucketize findings, will address concerns you raised on the call, I hope. @stacywray