anchore / grype

A vulnerability scanner for container images and filesystems
Apache License 2.0
8.55k stars 555 forks source link

vex: Add package name to VEX product identifiers #1905

Open ferozsalam opened 4 months ago

ferozsalam commented 4 months ago

What would you like to be added:

It would be good to add the pkg.Source.Name and pkg.Source.Digest information to the matchable product identifiers when using VEX documents to filter out vulnerabilities. This would allow for VEX document filtering on locally built images, which currently doesn't work when scanning images built locally with docker build. In turn, this would help with CI workflows that use VEX documents to filter out triaged vulnerabilities in locally built images.

I have explained my investigation and reasoning in more detail below. If the change sounds reasonable (it is minor), I am happy to make it myself.

Why is this needed:

When using a VEX document to filter out vulnerabilities in Grype, the product field is used to specify which images in particular a vulnerability annotation applies to.

So for example the following VEX document:

{
  "@context": "https://openvex.dev/ns/v0.2.0",
  "@id": "https://openvex.dev/docs/public/vex-9106f1944ac87cf813b77e83bcbbf8c3d1ce7f593c6f4a5083fc64ad6e1d345b",
  "author": "Unknown Author",
  "timestamp": "2024-06-03T08:05:19.611896+05:30",
  "version": 1,
  "statements": [
    {
      "vulnerability": {
        "name": "CVE-2023-31484"
      },
      "timestamp": "2024-06-03T08:05:19.611897+05:30",
      "products": [
        {
          "@id": "pkg:oci/nginx"
        }
      ],
      "status": "fixed"
    }
  ]
}

can be used against nginx:1.27.0 (make sure you are ignoring fixed VEX statuses in your grype.yaml:

$ cat .grype.yaml
ignore:
  vex-status: fixed
$ grype nginx:1.27.0 --vex vex.json | grep CVE-2023-31484
 ✔ Vulnerability DB                [no update available]
 ✔ Loaded image                                                 nginx:1.27.0
...
 ✔ Scanned for vulnerabilities     [157 vulnerability matches]
   ├── by severity: 2 critical, 14 high, 33 medium, 4 low, 75 negligible (30 u
   └── by status:   0 fixed, 158 not-fixed, 1 ignored
$

As shown above, this works as expected (1 ignored vulnerability, where the pkg:oci/nginx matches the pURL of the container being scanned) when downloading an image from a container registry. However, when building an image locally, this doesn't work.

Dockerfile:

FROM nginx:1.27.0

USER app

then run: $ docker build -t nginx-grype-test ., and update the VEX document above to specify the new image name:

      "products": [
        {
          "@id": "pkg:oci/nginx-grype-test"
        }

However, Grype doesn't ignore the same vulnerability in this new image, despite there being essentially no difference in the images:

$ grype nginx-grype-test --vex vex.json| grep CVE-2023-31484
 ✔ Vulnerability DB                [no update available]
 ✔ Loaded image                                      nginx-grype-test:latest
...
   └── by status:   0 fixed, 158 not-fixed, 0 ignored
perl-base           5.36.0-7+deb12u1         (won't fix)  deb   CVE-2023-31484    High

Doing some digging + adding some extra debug logging, I've discovered that the generation of product identifiers to match on happens here, and that in the case of locally built images, v.RepoDigests doesn't contain any information, so there's practically no way for Grype to successfully product match on a locally built image. This breaks CI workflows, for example building and then scanning a container with VEX documents applied.

However, the pkgContext.Source.Name and pkgContext.Source.ID (the local image name and digest) are all set, so it would be trivial (and sensible) to add them to the list of matchable fields for VEX documents, which would fix the issue. productIdentifiersFromContext already has a TODO that suggests We could generate more identifiers to match better., and I think this change would help with that.

As mentioned earlier, I have a good idea of what needs to be done, so if this seems like a sensible approach I'll go ahead and implement it.

cc @puerco as the original author of the VEX functionality.

tgerla commented 3 months ago

Hi @ferozsalam, we think this makes sense, it would be great if you wanted to implement it and send in a PR. Let us know if you have any trouble or any questions, and thanks in advance!

ferozsalam commented 3 months ago

Thanks @tgerla - I'll try to look at this in the coming week.

ferozsalam commented 1 week ago

@tgerla I'm finally taking a look at this at the moment and I had a question. When using grype to scan an image from a remote registry, I note that the source.ImageMetadata field has an empty Tags array, even when a tag has been provided, for example with grype busybox:latest.

(source.ImageMetadata) {
 UserInput: (string) (len=14) "busybox:latest",
 ID: (string) (len=71) "sha256:7db2ddde018a2a56e929855445bc7f30bc83db514a23404bd465a07d2770ac5f",
 ManifestDigest: (string) (len=71) "sha256:71e065368796c7368a99a072019b9fe73e28e225ae9882430579ec49a1e46235",
 MediaType: (string) (len=42) "application/vnd.oci.image.manifest.v1+json",
 Tags: ([]string) {
 },

I was hoping to use the Tags field when generating identifiers for the VEX processor to match on, as it would significantly improve the granularity with which users could target VEX statements. Would it be fine for me to split on the UserInput string in this case, or is there another approach that you would suggest? I can't see anything else in the ImageMetadata struct that contains the tag information.