Closed Dentrax closed 12 months ago
@Dentrax You can get SARIF output and scanning GH action using betterscan https://github.com/marcinguy/betterscan-ce
It has OSV scanner as one the scanners
Hey @marcinguy, since this is slightly out-of-context of what I'm proposing here; but I definitely check that project, seems interesting! Thanks for dropping that.
@Dentrax No worries. Was trying to help. Yea, different outputs from OSV scanner would be great. +1
Thanks for the suggestion! This is indeed something we've been thinking about. This is something that may be blocked on #150.
Can you also speak a bit about how you intend to consume the SARIF output? Is this meant to be fed to e.g. GitHub's code scanning alerts? Are there other use cases for this?
Is this meant to be fed to e.g. GitHub's code scanning alerts?
Actually, yes. This is my only use-case.
Hey guys, any update on SARIF support?
I'm also interested in this feature. It makes integrations with CI/CD tools a way nicer!
This is already being worked on by @another-rex in https://github.com/google/osv-scanner/pull/420.
I tested this and it outputs to sarif format great, but probably not quite what I expected. I'm going through osv which will check for say 10 issues, and I'm expecting all 10 to be expanded and displayed (in order to be put into results), but at the moment it looks like it's just writing a result and printing a table in the message. This should not be the desired result for this requirement.
I tested this and it outputs to sarif format great, but probably not quite what I expected. I'm going through osv which will check for say 10 issues, and I'm expecting all 10 to be expanded and displayed (in order to be put into results), but at the moment it looks like it's just writing a result and printing a table in the message. This should not be the desired result for this requirement.
Thanks for the feedback @Ma1tobiose. Can you please give a sample output that you'd like to see?
Also curious if there's feedback from any others in this issue. e.g. @pwa-tapptic , @jaskaransinghdr6j @Dentrax
I tested it on my side and I'd have same remark as @Ma1tobiose .
https://github.com/anchore/grype creates SCA results in SARIF format quite nicely (syft . -o json | grype -q -o sarif
).
As @Ma1tobiose said, OSV compresses everything into single finding in SARIF, but findings should be rather separate objects in results
array:
{
"version": "2.1.0",
"$schema": "https://json.schemastore.org/sarif-2.1.0-rtm.5.json",
"runs": [
{
"tool": { ...
},
"results": [
{
"ruleId": "GHSA-72xf-g2v4-qvf3-tough-cookie",
"message": {
"text": "The path /yarn.lock reports tough-cookie at version 4.1.2 which would result in a vulnerable (npm) package installed"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "/yarn.lock"
},
"region": {
"startLine": 1,
"startColumn": 1,
"endLine": 1,
"endColumn": 1
}
}
}
]
},
{
"ruleId": "GHSA-776f-qx25-q3cc-xml2js",
"message": {
"text": "The path /yarn.lock reports xml2js at version 0.4.23 which would result in a vulnerable (npm) package installed"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "/yarn.lock"
},
"region": {
"startLine": 1,
"startColumn": 1,
"endLine": 1,
"endColumn": 1
}
}
}
]
},
...
just to give you an example.
Thanks!
Our motivation behind a single aggregated report, is to better connect this to #352 via a single remediation command included in the result (e.g. "to fix these interactively, run osv-scanner fix ...
"), where we provide a tool that can help with interactively (or automatically) addressing multiple vulnerabilities at once in the same lockfile/manifest file.
Could we understand a bit better how having separate results helps your use cases? Does this provide better UX for you to look at vulnerability results individually as opposed to a single one?
Thanks! @oliverchang I can understand that osv will store the data structure in an aggregated form in order to be able to fix it in one click. But if the output is sarif or json, my side hope is to access the CICD process, osv as one of the tools in the output standardized report, so that it is easy to do the aggregated display with the results of other tools. So I would need a report in a format similar to the one provided by @pwa-tapptic (results in a separate form).
@oliverchang one-click fix is an awesome feature and great motivation behind tool as OSV, it must be seen as something valuable from CLI user perspective. But SARIF (i.e. Static Analysis Results Interchange Format) has slightly different motivations behind (details here), as the name of the format itself suggests. CI/CD tools are one of potential beneficiaries of the standardized interchange format.
@another-rex let's investigate if there's a good middle ground here.
+1 SARIF should ideally represent each "finding" as it's own line item. Being an interchange format, it would be right to use it as a 1 CVE per item template.
Here is the current format I am thinking of merging, would love any feedback (fyi: @pwa-tapptic @jaskaransinghdr6j @Ma1tobiose)
{
"version": "2.1.0",
"$schema": "https://json.schemastore.org/sarif-2.1.0.json",
"runs": [
{
"tool": {
"driver": {
"informationUri": "https://github.com/google/osv-scanner",
"name": "osv-scanner",
"rules": [
{
"id": "RUSTSEC-2022-0041",
"shortDescription": {
"text": "OSV.Summary"
},
"fullDescription": {
"text": "OSV.Details",
"markdown": "OSV.Details"
},
"help": {
"text": "...",
"markdown": "Markdown table of occurrences in the repo (i.e. If this vuln affects multiple packages/lockfiles)"
}
}
]
}
},
"artifacts": [
{
"location": {
"uri": "path/to/Cargo.lock"
},
"length": -1
}
],
"results": [
{
"ruleId": "RUSTSEC-2023-0045",
"ruleIndex": 0,
"level": "warning",
"message": {
"text": "Package 'memoffset@0.6.1' is vulnerable to 'RUSTSEC-2023-0045', please upgrade to versions '0.6.2' to fix this vulnerability"
},
"locations": [
{
"physicalLocation": {
"artifactLocation": {
"uri": "path/to/Cargo.lock"
}
}
}
]
}
// ... One result per vulnerable package per vulnerability
]
}
]
}
+1 SARIF should ideally represent each "finding" as it's own line item. Being an interchange format, it would be right to use it as a 1 CVE per item template.
This is complicated by the fact that not every OSV entry has a corresponding CVE (e.g. RUSTSEC-2023-0045), and we can have multiple advisories from different sources referring to the exact same vulnerability.
This is common for vulnerabilities published by github advisories, where for some languages there will be security advisories published by the language maintainers as well.
One solution to this we are thinking is to fill "ruleId" with all the aliased vulnerabilities joined together (e.g. id: GHSA-wfg4-322g-9vqv,RUSTSEC-2023-0045
). SARIF provides a 3.49.4 deprecatedIds property
which is an array of unique IDs, in which we can separate those IDs out into separate array entries.
Tell us if this will work for your workflows!
This is great! However, I feel as many findings aggregators will be using these findings to autofill fields like "Finding Title", the joined ruleId might be a bit too onerous.
There seems to be no straightforward way to deal with this according to the SARIF spec. What if we default the ruleId to "CVE-XXXX" if available, and fallback to package specific id "RUSTSEC-YYYY" if the former is not available? (or vice-versa)
The other approach could be to list all the aliases in the message text field as : "Also Reported as RUSTSEC-YYYY
and GHSA-ZZZZ
"
The formatting is great, and I agree with @jaskaransinghdr6j's solution (if it's feasible), being able to easily query the cve and refer sources would make it much quicker to find vulnerability details and fixes.
There seems to be no straightforward way to deal with this according to the SARIF spec. What if we default the ruleId to "CVE-XXXX" if available, and fallback to package specific id "RUSTSEC-YYYY" if the former is not available? (or vice-versa)
I implemented this in the SARIF PR and it's now been merged, so please try it out in latest main branch of osv-scanner (this change is not in 1.4.0) and see if this fits your use cases!
It'd be great to have a SARIF output format to upload the results to GitHub. (i.e. with
github/codeql-action/upload-sarif
action)Blocked by #217