solo-io / go-utils

golang utilities
Apache License 2.0
111 stars 19 forks source link

Trivy Scanning: Scan all images and only create issue for latest patch release #485

Closed sam-heilbron closed 2 years ago

sam-heilbron commented 2 years ago

Description

Context

Fix Trivy scan bug

In https://github.com/solo-io/go-utils/issues/478, we identified that our historical scan implementation would create a Github issue for every version scanned that contained a vulnerability, and that this was extremely noisy to developers.

As a result, we added a slight adjustment to the code in https://github.com/solo-io/go-utils/pull/479. The unintended side effect of this was in addition to only creating github issues only for the latest patch release, we also only scanned the latest patch release. This speeds up our weekly job tremendously, but it also no longer produces results for older versions that users might be on. Since vulnerabilities might be identified after that fact, we want to make sure we're scanning all versions, since there might be a new vulnerability identified that we want to make users aware of.

Expose new flag to only create issues for latest patch releases

We expose a new flag CreateGithubIssueForLatestPatchVersion, which when enabled will ensure that when a vulnerability is found for a given version, a github issue will be created only if that version is the latest patch release for that minor version.

Move code into distinct files

Though these changes are somewhat noisy, and perhaps distract from the PR, I think they are valuable. Reading through the code, I found it challenging at times to distinguish the different components in our scanner. I copied tightly coupled code (trivy, github, predicates) into their own files so that the main scan file can have only the main scanning code.

Stats logging follow-up

In https://github.com/solo-io/go-utils/pull/487 we added improvement to our stats utility, so that our logging endpoint handler does not throw a NPE. However, the goroutine responsible for waiting for a context to be cancelled and then initiating a server shutdown, was using the same context to shutdown the server, that it was using to determine if the server should be shutdown. This resulted in inconsistent behavior, and the server not being shutdown properly, which would lead to flakes.

This PR added a new context to be used for shutdown.

Testing

Testing the security scanner end-to-end is challenging. A lot of the code relies on a github.Client that cannot be mocked. Therefore for this change I propose the following

  1. I added unit tests to cover the case I am trying to solve, that we only create issues for the latest patch release
  2. After merging this, I will upgrade the dependency in Gloo, and run the job and manually verify the results.

Manual Testing

I use https://github.com/solo-io/gloo/pull/6709 to manually verify these changes. In that PR, I outline the steps required to run the job locally and ways to verify both that (A) we scan all releases and (B) only generate github issues for the latest patch.

BOT NOTES: resolves https://github.com/solo-io/go-utils/issues/478

nfuden commented 2 years ago

As far as I can tell this is mainly addressing adding a repofilter that is applied only on the issue creation step which is a pretty easy update. Am I missing something else?

sam-heilbron commented 2 years ago

As far as I can tell this is mainly addressing adding a repofilter that is applied only on the issue creation step which is a pretty easy update. Am I missing something else?

Yup! That's it. I thought that was the set of changes we were trying to support

solo-changelog-bot[bot] commented 2 years ago

Issues linked to changelog: https://github.com/solo-io/go-utils/issues/469 https://github.com/solo-io/go-utils/issues/478