kubernetes / sig-security

Process documentation, non-code deliverables, and miscellaneous artifacts of Kubernetes SIG Security
Apache License 2.0

[govulncheck] Generate VEX documents from `govulncheck` output #116

Open PushkarJ opened 5 months ago

PushkarJ commented 5 months ago

WHAT

As part of #95 we have now set up govulncheck to run on each PR and periodically on master + stable release branches as part of verify jobs.

govulncheck has now added support for openvex: https://pkg.go.dev/golang.org/x/vuln@v1.1.2/internal/openvex

We should explore whether it makes sense to add VEX documents as part of each of our releases going forward.

WHY

This will partially solve the issue of k8s maintainers being requested to provide input on whether a specific CVE is affecting k/k or not by preemptively generating VEX documents for the CVEs where Kubernetes is unaffected. This will also allow us to codify the policy mentioned here: https://github.com/kubernetes/community/blob/6c75205e1b67a84d5784502dd27d1a0e04192021/contributors/devel/sig-release/cherry-picks.md?plain=1#L65

To illustrate the point, dependency updates that just aim to silence some scanners and do not fix any vulnerable code are NOT eligible for cherry-picks.

Some examples:

https://github.com/kubernetes/kubernetes/issues/121370
https://github.com/kubernetes/kubernetes/issues/122424
https://github.com/kubernetes/kubernetes/issues/119227
https://github.com/kubernetes/kubernetes/issues/122952

and many more before govulncheck was introduced
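For readers unfamiliar with the format, here is a minimal sketch of what a single "not affected" VEX statement could look like when built by hand. The field names follow the OpenVEX v0.2.0 spec as I understand it; the vulnerability ID, product identifier, document ID and author below are placeholders chosen for illustration, not real project data.

```python
import json
from datetime import datetime, timezone


def not_affected_statement(vuln_name, product_id, justification):
    """Build one OpenVEX-style statement asserting a product is not affected."""
    return {
        "vulnerability": {"name": vuln_name},
        "products": [{"@id": product_id}],
        "status": "not_affected",
        "justification": justification,
    }


def vex_document(doc_id, author, statements):
    """Wrap statements in a minimal OpenVEX-style document envelope."""
    return {
        "@context": "https://openvex.dev/ns/v0.2.0",
        "@id": doc_id,
        "author": author,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "version": 1,
        "statements": statements,
    }


# All values below are hypothetical placeholders.
doc = vex_document(
    doc_id="https://example.com/vex/example-0001",
    author="Example Author",
    statements=[
        not_affected_statement(
            "CVE-0000-00000",
            "pkg:golang/example.com/some/module@v1.0.0",
            "vulnerable_code_not_in_execute_path",
        )
    ],
)
print(json.dumps(doc, indent=2))
```

A scanner that understands VEX could use such a statement to suppress the matching finding for that product only.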

HOW

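We need a trusted way to generate the VEX document, where the following properties are desired:

The VEX document is auto-generated
The VEX document can be modified if needed by a small trusted list of k8s org members
The VEX document can not be tampered
The VEX document generation or modification supports non-repudiation
The VEX document can be version controlled with git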

Open to ideas on the how and we can all explore possible options together.

WHO

This would need collaboration between SIG Security, Docs, Architecture and Release.

WHERE

We may potentially host it besides https://kubernetes.io/docs/reference/issues-security/official-cve-feed/ but of course other ideas or placements are welcome!

NOTES

Part of #3, related https://github.com/kubernetes/kubernetes/issues/121454

Work being Done

/sig security architecture docs release
/area dependency
/kind feature

ritazh commented 5 months ago

We may potentially host it besides https://kubernetes.io/docs/reference/issues-security/official-cve-feed/

+1

What are other potential places to push this artifact so scanners can automatically consume them?

PushkarJ commented 5 months ago

To answer Rita's question and get a sense of what the actual output looks like, I tried this on k/k master and release-1.30.

main branch Results

govulncheck -scan package -format openvex ./... > openvex-commit-34b8832edb00c83ba50b88ea9eefb608d3247131.out

openvex-commit-34b8832edb00c83ba50b88ea9eefb608d3247131-main.out.txt

Only 2 out of 153 CVEs were marked as affected according to govulncheck@v1.1.2 🎉
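A count like the one above can be reproduced from the generated file with a few lines of Python. This is a rough sketch that assumes the output is a JSON document with a top-level `statements` list whose entries carry `status` and `vulnerability.name`, as in the OpenVEX spec; the sample data is made up and only stands in for the real file.

```python
from collections import Counter


def tally_statuses(vex_doc):
    """Count VEX statements by their 'status' field."""
    return Counter(s.get("status", "unknown") for s in vex_doc.get("statements", []))


def affected_names(vex_doc):
    """Return the sorted names of vulnerabilities marked as affected."""
    return sorted(
        s.get("vulnerability", {}).get("name", "")
        for s in vex_doc.get("statements", [])
        if s.get("status") == "affected"
    )


# Hypothetical stand-in for the parsed govulncheck output.
SAMPLE = {
    "statements": [
        {"vulnerability": {"name": "GO-0000-0001"}, "status": "not_affected"},
        {"vulnerability": {"name": "GO-0000-0002"}, "status": "not_affected"},
        {"vulnerability": {"name": "GO-0000-0003"}, "status": "affected"},
    ]
}

counts = tally_statuses(SAMPLE)
print(dict(counts))
print(affected_names(SAMPLE))
```

For the real file, replace `SAMPLE` with the result of `json.load()` on the `.out` file produced by the command above.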

Further drilling down found the following information on both:

GO-2024-2527

https://github.com/etcd-io/etcd/security/advisories/GHSA-5x4g-q5rc-36jp

Being discussed as contentious / wrong: https://github.com/golang/vulndb/issues/2952

CVE-2024-28180

https://github.com/kubernetes/kubernetes/issues/123252

Confirmed by maintainers that this does not have security implications: https://github.com/kubernetes/kubernetes/pull/123649#issuecomment-1975021556

release-1.30 branch results

openvex-1.30.out.txt

govulncheck -scan package -format openvex ./... > openvex-1.30.out

Only 3 out of 154 CVEs were marked as affected by govulncheck@v1.1.2 🎉

2 out of 3 were identical to above

For the remaining one, I found the following information:

CVE-2023-47108

Maintainers discussed and confirmed that this does not affect Kubernetes in https://github.com/kubernetes/kubernetes/pull/121842 and https://github.com/kubernetes/apiserver/issues/106#issuecomment-1952789941
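The branch comparison described above can be sketched as a set difference over the "affected" entries of two documents. The two inputs below are hand-written from the IDs quoted in this comment, not the real generated files, and the assumption that each statement exposes `vulnerability.name` and `status` comes from the OpenVEX spec.

```python
def affected_set(vex_doc):
    """Collect the names of vulnerabilities a VEX document marks as affected."""
    return {
        s["vulnerability"]["name"]
        for s in vex_doc.get("statements", [])
        if s.get("status") == "affected" and "name" in s.get("vulnerability", {})
    }


def compare_branches(main_doc, release_doc):
    """Return (common, only_in_main, only_in_release) sets of affected names."""
    main, release = affected_set(main_doc), affected_set(release_doc)
    return main & release, main - release, release - main


# Hand-written stand-ins using the IDs mentioned in this comment.
MAIN = {
    "statements": [
        {"vulnerability": {"name": "GO-2024-2527"}, "status": "affected"},
        {"vulnerability": {"name": "CVE-2024-28180"}, "status": "affected"},
        {"vulnerability": {"name": "CVE-2023-47108"}, "status": "not_affected"},
    ]
}
RELEASE_130 = {
    "statements": [
        {"vulnerability": {"name": "GO-2024-2527"}, "status": "affected"},
        {"vulnerability": {"name": "CVE-2024-28180"}, "status": "affected"},
        {"vulnerability": {"name": "CVE-2023-47108"}, "status": "affected"},
    ]
}

common, only_main, only_release = compare_branches(MAIN, RELEASE_130)
print(sorted(common), sorted(only_main), sorted(only_release))
```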

Next Steps

As always open to suggestions and more feedback as we are in early stages of this discussion!

PushkarJ commented 4 months ago

/cc @MadhavJivrajani @dims @puerco @jeremyrickard @sftim

(For more input from SIG Architecture, Release and Docs)

dims commented 4 months ago

+1 for the idea with my sig-arch hat on.

puerco commented 4 months ago

+1 with my releng hat on!

@PushkarJ: I added some views on govulncheck about the missing product, I think we can fix it by sending a couple of patches to govulncheck.

Also, I read your comment there about the multiple products. Don't worry about it, once we have the source openvex document, we can enhance it to include the rest of the release artifacts by pulling in data from the SBOMs.

This is super exciting :tada:
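One possible shape for the SBOM enrichment mentioned above, as a rough sketch only: read package URLs from an SPDX JSON SBOM and add them as products to every statement of the source document. The SPDX field names (`packages`, `externalRefs`, `referenceType`, `referenceLocator`) are from SPDX 2.x JSON as I understand it; the sample data is invented, and applying every statement to every artifact is a simplifying assumption that would need review per artifact.

```python
import copy


def purls_from_spdx(sbom):
    """Extract purl identifiers from the packages of an SPDX 2.x JSON document."""
    purls = []
    for pkg in sbom.get("packages", []):
        for ref in pkg.get("externalRefs", []):
            if ref.get("referenceType") == "purl":
                purls.append(ref["referenceLocator"])
    return purls


def add_products(vex_doc, product_ids):
    """Return a copy of vex_doc with product_ids appended to every statement."""
    enriched = copy.deepcopy(vex_doc)
    for statement in enriched.get("statements", []):
        products = statement.setdefault("products", [])
        seen = {p.get("@id") for p in products}
        for pid in product_ids:
            if pid not in seen:  # avoid duplicate product entries
                products.append({"@id": pid})
                seen.add(pid)
    return enriched


# Hypothetical inputs for illustration.
SBOM = {
    "packages": [
        {"name": "artifact-a", "externalRefs": [
            {"referenceType": "purl", "referenceLocator": "pkg:generic/artifact-a@1.0.0"}]},
        {"name": "artifact-b", "externalRefs": []},
    ]
}
SOURCE_VEX = {
    "statements": [
        {"vulnerability": {"name": "GO-0000-0001"}, "status": "not_affected",
         "products": [{"@id": "pkg:golang/example.com/module@v1.0.0"}]},
    ]
}

enriched = add_products(SOURCE_VEX, purls_from_spdx(SBOM))
print(enriched["statements"][0]["products"])
```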

puerco commented 4 months ago

@ritazh we should also sign them and attach them to container images. Grype will soon autodiscover them there. We are also working on a specification to host the vex data in the repositories (see https://github.com/openvex/spec/issues/46 )

ritazh commented 1 month ago

xpost this from @jeremyrickard

We did not bump the go dependency prior to our most recent patch releases because upon review these CVEs are not present in the way we use Go. Trivy unfortunately does not do real analysis and this report is incorrect.

and this from @BenTheElder

SRC == https://github.com/kubernetes/committee-security-response

SRC: Should we start publishing VEX?
SRC: Who will maintain and verify them? If we start publishing these we want to be sure that they're correct.

Would like to hear from other SRC members. cc @kubernetes/security-response-committee

tabbysable commented 1 week ago

@puerco how are the release team's plans for using OpenVEX in the release process looking? We've got govulncheck 1.1.2 in main now, but the PRs for upgrading the 1.29 and 1.30 branches are stalled in review.

If you're waiting for this upgrade so you can move forward with development, please chime in on those issues to help get them unblocked.

BenTheElder commented 1 week ago

Would like to hear from other SRC members. cc @kubernetes/security-response-committee

@ritazh did we ever get a consensus from SRC? (also pinged the alias on the linked comment previously)

joelsmith commented 1 week ago

I think publishing VEX documents is a great idea. 👍 from me

ritazh commented 1 week ago

Thanks @joelsmith!

From all SRC members I talked to, everyone was ok with:

SRC: Should we start publishing VEX?

Would be good to get explicit 👍 from @kubernetes/security-response-committee regarding:

SRC: Who will maintain and verify them? If we start publishing these we want to be sure that they're correct.

As that is a new responsibility for the SRC and currently I believe @liggitt and @dims usually decide if a dependency impacts k8s or not.

dims commented 1 week ago

usually decide if a dependency impacts k8s or not.

in consultation with the folks who know better than us :)

joelsmith commented 1 week ago

I don't want to speak for the whole SRC, but I am willing to help do it as part of the SRC. However, reporting that we are not affected isn't necessarily something that the SRC needs to do (like triage of non-public reports is) so if other trusted community members want to own it, it's certainly a possibility in my view. For example, we could consider checking to see if anyone within sig-security is willing to do it. As of now sig-security doesn't own repos or triage issues (unless I am mistaken). Perhaps they would be interested in owning a repo containing VEX docs and similar vuln metadata and handling the issues related to creating or updating them.

knqyf263 commented 1 week ago

I am a member who maintains the Kubernetes CVE feed, and if I am not mistaken (sorry, I'm not familiar with the K8s team structure), I'm also a member of sig-security.

As a vulnerability scanner developer, I would be happy to have Kubernetes VEX available anyway to reduce noise, but if I can help as a member of sig-security, I will do as much as I can.

FYI: We're trying to collect VEX documents and make them available in our scanner. For example, Rancher published their own VEX repository. It would be great if K8s could do something similar.

BenTheElder commented 1 week ago

I don't want to speak for the whole SRC, but I am willing to help do it as part of the SRC. However, reporting that we are not affected isn't necessarily something that the SRC needs to do (like triage of non-public reports is) so if other trusted community members want to own it, it's certainly a possibility in my view. For example, we could consider checking to see if anyone within sig-security is willing to do it. As of now sig-security doesn't own repos or triage issues (unless I am mistaken). Perhaps they would be interested in owning a repo containing VEX docs and similar vuln metadata and handling the issues related to creating or updating them.

The first important part to me is less "who could possibly staff this" and more "who are we willing to grant that authority to", because it will be more formal than maintainers commenting on issues if we have a machine readable official artifact, and we need to be agreed who / what is permitted to make that decision. Then we have to make sure that we can staff that effort.

If we think automated VEX documents can offload reachability analysis to the project's infrastructure and processes from scanners that fail to do this, I guess maybe that's better than not doing it, but it continues to offload compliance and working around scanner inaccuracy onto the project.

As a vulnerability scanner developer, I would be happy to have Kubernetes VEX available anyway to reduce noise, but if I can help as a member of sig-security, I will do as much as I can.

@knqyf263 is there a reason we can't push whatever we'd generate into the VEX document off onto the scanner implementation?

If the reason is non automated identification, then yes, see discussion above about who we can decide to delegate that permission to, I think we'll figure that out. But if the reason is automated refutation of CVEs, I really think the project itself shouldn't have to spend time and resources on this versus discouraging use of inaccurate scanning ... that feels unfair and less sustainable.

BenTheElder commented 1 week ago

The VEX document is auto-generated
The VEX document can be modified if needed by a small trusted list of k8s org members
The VEX document can not be tampered
The VEX document generation or modification supports non-repudiation
The VEX document can be version controlled with git

We don't currently have the tooling to meet all of these requirements. Our git repos are not tamper-evident and doing this is not easy. There are a number of accounts that own the github organizations and can, in theory, rewrite the git history in any repo. We rely on trust.

Also, in general automated commits are a pain point, because we have to manage a credential with access to push, and robot accounts are not easy to create anymore, but the existing robot accounts are not well-isolated and we do not want to grant more things access to those accounts and increase the vulnerability surface area.

Maybe we can run a pipeline that combines the input of a git repo (where we rely on trust) to PR manual configuration / overrides in conjunction with emitting a generated artifact to ... somewhere.
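To make the pipeline idea above a bit more concrete, here is a minimal sketch of the combining step: reviewed overrides (which would live in a git repo and change only via PR) are layered on top of the auto-generated statements. The override format, a mapping from vulnerability name to replacement fields, is purely hypothetical, as are the IDs and text in the sample.

```python
import copy


def apply_overrides(generated_doc, overrides):
    """Layer reviewed overrides onto generated statements.

    Returns (merged_doc, unused_override_names). The generated input is not modified.
    """
    merged = copy.deepcopy(generated_doc)
    applied = set()
    for statement in merged.get("statements", []):
        name = statement.get("vulnerability", {}).get("name")
        if name in overrides:
            statement.update(overrides[name])  # override wins over the tool's result
            applied.add(name)
    # Overrides that matched nothing are reported so stale entries can be pruned.
    return merged, sorted(set(overrides) - applied)


# Hypothetical generated output and hypothetical reviewed overrides.
GENERATED = {
    "statements": [
        {"vulnerability": {"name": "CVE-0000-0001"}, "status": "affected"},
        {"vulnerability": {"name": "CVE-0000-0002"}, "status": "not_affected"},
    ]
}
OVERRIDES = {
    "CVE-0000-0001": {
        "status": "not_affected",
        "impact_statement": "Reviewed by maintainers; see linked issue.",
    },
    "CVE-0000-0099": {"status": "not_affected"},
}

merged, unused = apply_overrides(GENERATED, OVERRIDES)
print([s["status"] for s in merged["statements"]], unused)
```

Keeping the overrides as data in a reviewed repo and emitting the merged result elsewhere would avoid needing a credential that can push commits.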

This would need collaboration between SIG Security, Docs, Architecture and Release.

Are you foreseeing this as a mutated release artifact? We currently heavily cache release artifacts and they are all immutable...

SIG K8s Infra will need to be engaged with the hosting requirements. Wherever possible we want to avoid standing up more end-user-bandwidth consuming endpoints as they're a ~uncontrollable cost commitment, but maybe we can stick this under dl.k8s.io or something, we need to understand more about the requirements there.

If/when we get to that point, please file a request to https://github.com/kubernetes/k8s.io for the infrastructure requirements so we can sort that out.

knqyf263 commented 1 week ago

@knqyf263 is there a reason we can't push whatever we'd generate into the VEX document off onto the scanner implementation?

@BenTheElder First of all, I think we should consider pre-build and post-build scanning separately. Post-build binaries lack reachability context, but VEX documents generated from pre-build analysis with source code could provide this missing context for more accurate scanning.

Second, source code analysis is computationally expensive. Having scanners read a pre-generated VEX document is more efficient than performing deep analysis each time.

Providing VEX documents regardless of scanner implementation would be helpful for the above two reasons. If there are ways to accurately and efficiently evaluate reachability from built binaries, VEX documents might be unnecessary - I'd appreciate if you could share such methods if they exist.
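As a rough illustration of the scanner side, the sketch below drops findings that a pre-generated VEX document marks as `not_affected` for the scanned product. The findings format (a list of dicts with an `id` key) and all identifiers are hypothetical; real scanners match products and vulnerability aliases in more elaborate ways.

```python
def suppressed_ids(vex_doc, product_id):
    """Collect vulnerability names and aliases marked not_affected for product_id."""
    ids = set()
    for statement in vex_doc.get("statements", []):
        if statement.get("status") != "not_affected":
            continue
        if not any(p.get("@id") == product_id for p in statement.get("products", [])):
            continue
        vuln = statement.get("vulnerability", {})
        if vuln.get("name"):
            ids.add(vuln["name"])
        ids.update(vuln.get("aliases", []))
    return ids


def filter_findings(findings, vex_doc, product_id):
    """Keep only findings that the VEX document does not suppress."""
    suppressed = suppressed_ids(vex_doc, product_id)
    return [f for f in findings if f["id"] not in suppressed]


# Hypothetical data for illustration.
PRODUCT = "pkg:generic/example-binary@1.0.0"
VEX = {
    "statements": [
        {"vulnerability": {"name": "GO-0000-0001", "aliases": ["CVE-0000-0001"]},
         "products": [{"@id": PRODUCT}], "status": "not_affected"},
        {"vulnerability": {"name": "GO-0000-0002"},
         "products": [{"@id": PRODUCT}], "status": "affected"},
    ]
}
FINDINGS = [{"id": "CVE-0000-0001"}, {"id": "GO-0000-0002"}, {"id": "CVE-0000-0003"}]

remaining = filter_findings(FINDINGS, VEX, PRODUCT)
print([f["id"] for f in remaining])
```

This lookup is cheap compared with re-running reachability analysis on every scan, which is the efficiency argument made above.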

Lastly, regarding authority, VEX documents verified by Kubernetes maintainers would be more reliable than third-party generated ones. While automation would be ideal, Kubernetes maintainers are already confirming "not affected" status for various vulnerabilities in GitHub Issues - distributing these confirmations in a machine-readable format would be valuable.

By the way, SAST tools are not always accurate. Maintainer verification would make VEX documents more reliable, which is another reason why having Kubernetes project issue VEX would be beneficial. If VEX documents are just generated by tools like govulncheck, third parties could generate the same results, so the Kubernetes team might not need to manage them. Having said that, setting up SAST tools on K8s projects is not easy. I still think it would help to have official VEX.

macedogm commented 1 week ago

I'm from the Rancher Security team and we implemented 100% automated VEX scans and report generation with govulncheck and vexctl. Everything is pushed to our VEX Hub in rancher/vexhub, which is then fed into our scanners to remove the false positives. This allows our developers to focus on the real critical vulnerabilities. Plus it helps to provide more accurate data about CVEs in our own images.

We also do manual VEX when we check that the respective upstream projects confirmed a CVE as a false positive, for example, and then double check with govulncheck. In the Rancher specific context, we did some manual VEX for upstream K8s CVEs in https://github.com/rancher/vexhub/tree/main/pkg/golang/k8s.io/kubernetes and have a history of everything in https://github.com/rancher/vexhub/blob/main/reports/vex_cves.csv.

Speaking as someone who triages such CVEs daily and answers customers' inquiries, I truly believe that VEX is the way forward for removing the false-positive CVE noise.

I really think the project itself shouldn't have to spend time and resources on this versus discouraging use of inaccurate scanning ... that feels unfair and less sustainable.

I personally don't believe that this current "CVE madness" in our industry is the fault of the vulnerability scanners, it's quite the opposite. The way that the CVEs are assigned, their lack of context around the exact affected paths and functions and the way that programming languages and their tooling handle dependencies would be the main root cause IMHO.

Doing VEX is currently a relatively easy way to improve CVE applicability for projects, images etc. and to provide more applicable vulnerability data for users.

If K8s were to provide its own VEX reports, I can see users and downstream projects (like Rancher, for example) having more accurate data about which CVEs are really applicable.

Having said that, I kindly offer my assistance to help set up and maintain automation in K8s to generate its own VEX reports.

BenTheElder commented 1 week ago

@BenTheElder First of all, I think we should consider pre-build and post-build scanning separately. Post-build binaries lack reachability context, but VEX documents generated from pre-build analysis with source code could provide this missing context for more accurate scanning.

This is a good point, thank you.

Second, source code analysis is computationally expensive. Having scanners read a pre-generated VEX document is more efficient than performing deep analysis each time.

Some better understanding of quite how expensive would be helpful. Our release process is already very expensive and we're looking at e.g. dropping ~unused CPU architectures for this and other reasons.

Kubernetes maintainers are already confirming "not affected" status (https://github.com/kubernetes/kubernetes/issues/122457#issuecomment-1868093606) for various vulnerabilities in GitHub Issues - distributing these confirmations in a machine-readable format would be valuable.

See discussion above about manual vs automated and official vs not so official. One of us commenting on the issue is NOT the same thing as hosting a document. That said I agree that it would make sense to do so, it just needs more details and clear ownership/delegation. We can get away without that while it's not so official, now we can't and need to resolve that part.

By the way, SAST tools are not always accurate (https://github.com/golang/go/issues/69446). Maintainer verification would make VEX documents more reliable, which is another reason why having Kubernetes project issue VEX would be beneficial. If VEX documents are just generated by tools like govulncheck, third parties could generate the same results, so the Kubernetes team might not need to manage them. Having said that, setting up SAST tools on K8s projects is not easy. I still think it would help to have official VEX.

Remember that we have limited bandwidth for that though, and folks currently commenting are a) best effort, b) generally not official, c) require no additional resources beyond github.


So from the k8s infra side:
1. We need to understand how expensive is "expensive" computationally.
2. We need to understand what it looks like to actually host these. Are the files large? Are they mutated after release?

EDIT: we can take 2) to github.com/kubernetes/k8s.io later, just providing a heads up that reading between the lines we might have fun with that part. Content hosting has been a constant footgun. If it's just an ~immutable release artifact we are good to go though as long as sig release is willing to handle that.

From the SRC / steering committee side:
3. For any manual assertions that will be positioned as an official artifact, we need to decide who will own that and make sure SRC is fine with that. Comments on github issues are not the same thing. Anyone can do that.

Hosting automation is something we do as regular business. Hosting non-immutable artifacts would be new, hosting things that write to repos is something we generally avoid. Hosting something that has to run in the release process and is potentially computationally expensive is something that needs weighing.

knqyf263 commented 6 days ago

Our release process is already very expensive

Unlike SBOMs, VEX generation doesn't necessarily need to be integrated into the release process since vulnerability disclosures don't align with release timing. For example, a vulnerability in a dependency like protobuf used by Kubernetes might be discovered the day after a Kubernetes release.

In such cases, vulnerability reachability analysis should be performed independently of the release process. Ideally, this analysis should cover all supported versions (the latest three minor versions) because a vulnerability might affect 1.30 but not 1.31, or vice versa. Therefore, VEX documents should be updated when vulnerabilities are discovered, regardless of release timing. While this is the ideal scenario, starting with a more manageable scope would still be valuable, such as analyzing reachability only for the latest version or updating VEX documents monthly.

I view VEX as a living document that needs to be continuously updated, not an immutable artifact, given that vulnerabilities are discovered and analyzed on an ongoing basis. However, since VEX documents are mergeable, it's possible to treat individual VEX files as immutable through careful design. We could implement a strategy where new findings are added as new files rather than modifying existing ones.
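The append-only strategy above could work roughly like this: each file is written once, and a consumer merges all files by keeping the most recent statement per vulnerability and product. This is a simplified sketch, not the merge algorithm of any existing tool; it assumes every document carries an ISO 8601 `timestamp` with a timezone, and the IDs are placeholders.

```python
from datetime import datetime


def parse_ts(value):
    """Parse an ISO 8601 timestamp, accepting a trailing 'Z' for UTC."""
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def merge_vex(docs):
    """Return {(vuln_name, product_id): status}, keeping the newest statement per key."""
    latest = {}
    for doc in docs:
        doc_ts = doc.get("timestamp")
        for statement in doc.get("statements", []):
            # A statement-level timestamp, if present, overrides the document's.
            ts = parse_ts(statement.get("timestamp", doc_ts))
            name = statement["vulnerability"]["name"]
            for product in statement.get("products", []):
                key = (name, product["@id"])
                if key not in latest or ts >= latest[key][0]:
                    latest[key] = (ts, statement["status"])
    return {key: status for key, (_, status) in latest.items()}


# Two hypothetical immutable files: a later one supersedes an earlier finding.
PRODUCT = "pkg:generic/example@1.0.0"
FILE_1 = {
    "timestamp": "2024-06-01T00:00:00Z",
    "statements": [
        {"vulnerability": {"name": "CVE-0000-0001"},
         "products": [{"@id": PRODUCT}], "status": "under_investigation"},
    ],
}
FILE_2 = {
    "timestamp": "2024-07-01T00:00:00Z",
    "statements": [
        {"vulnerability": {"name": "CVE-0000-0001"},
         "products": [{"@id": PRODUCT}], "status": "not_affected"},
    ],
}

merged = merge_vex([FILE_1, FILE_2])
print(merged)
```

Because the result depends only on timestamps, the order in which the files are read does not matter, which fits hosting them as cacheable, immutable objects.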

One of us commenting on the issue is NOT the same thing as hosting a document.

I see VEX as documenting decisions not to patch vulnerabilities rather than just informal comments. While anyone can comment on issues freely, the decision to leave a vulnerability unpatched is not trivial, and I believe maintainers make such decisions with substantial justification. That said, I understand that official documentation carries different weight and responsibilities compared to informal comments.

Remember that we have limited bandwidth for that though, and folks currently commenting are a) best effort, b) generally not official, c) require no additional resources beyond github.

Yes, I understand, and I don't think maintainers need to review every vulnerability. The current best-effort approach of confirming non-applicability for certain vulnerabilities is already valuable. Also, similar to what @macedogm is doing at Rancher, generating VEX documents based on SAST tool results could help reduce the burden of publishing official documentation.