anchore / grype

A vulnerability scanner for container images and filesystems
Apache License 2.0
8.79k stars 570 forks

Show layers from where a vulnerability originated #739

Open rmccarth opened 2 years ago

rmccarth commented 2 years ago

What would you like to be added:

Similar to the old anchore format, please show the layer/inheritance status of a particular layer which contains vulnerabilities.

Why is this needed:

Functionality from anchore inline scan was powerful and should continue.

Loqova commented 2 years ago

This would indeed be very useful.

spiffcs commented 2 years ago

@rmccarth thanks for filing an issue!

When I run grype -o json node:latest I get a list of vulnerabilities in that image.

Here is an example of one of those vulnerabilities:

  {
   "vulnerability": {
    "id": "CVE-2004-0971",
    "dataSource": "https://security-tracker.debian.org/tracker/CVE-2004-0971",
    "namespace": "debian:11",
    "severity": "Negligible",
    "urls": [
     "https://security-tracker.debian.org/tracker/CVE-2004-0971"
    ],
    "description": "The krb5-send-pr script in the kerberos5 (krb5) package in Trustix Secure Linux 1.5 through 2.1, and possibly other operating systems, allows local users to overwrite files via a symlink attack on temporary files.",
    "cvss": [],
    "fix": {
     "versions": [],
     "state": "not-fixed"
    },
    "advisories": []
   },
   "relatedVulnerabilities": [
    {
     "id": "CVE-2004-0971",
     "dataSource": "https://nvd.nist.gov/vuln/detail/CVE-2004-0971",
     "namespace": "nvd",
     "severity": "Low",
     "urls": [
      "http://www.securityfocus.com/bid/11289",
      "http://www.gentoo.org/security/en/glsa/glsa-200410-24.xml",
      "http://www.redhat.com/support/errata/RHSA-2005-012.html",
      "http://www.trustix.org/errata/2004/0050",
      "http://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=136304",
      "https://exchange.xforce.ibmcloud.com/vulnerabilities/17583",
      "https://oval.cisecurity.org/repository/search/definition/oval%3Aorg.mitre.oval%3Adef%3A10497",
      "https://lists.apache.org/thread.html/rc713534b10f9daeee2e0990239fa407e2118e4aa9e88a7041177497c@%3Cissues.guacamole.apache.org%3E"
     ],
     "description": "The krb5-send-pr script in the kerberos5 (krb5) package in Trustix Secure Linux 1.5 through 2.1, and possibly other operating systems, allows local users to overwrite files via a symlink attack on temporary files.",
     "cvss": [
      {
       "version": "2.0",
       "vector": "AV:L/AC:L/Au:N/C:N/I:P/A:N",
       "metrics": {
        "baseScore": 2.1,
        "exploitabilityScore": 3.9,
        "impactScore": 2.9
       },
       "vendorMetadata": {}
      }
     ]
    }
   ],
   "matchDetails": [
    {
     "type": "exact-indirect-match",
     "matcher": "dpkg-matcher",
     "searchedBy": {
      "distro": {
       "type": "debian",
       "version": "11"
      },
      "namespace": "debian:11",
      "package": {
       "name": "krb5",
       "version": "1.18.3-6+deb11u1"
      }
     },
     "found": {
      "versionConstraint": "none (deb)"
     }
    }
   ],
   "artifact": {
    "name": "krb5-multidev",
    "version": "1.18.3-6+deb11u1",
    "type": "deb",
    "locations": [
     {
      "path": "/usr/share/doc/krb5-multidev/copyright",
      "layerID": "sha256:2fbabeba902e7f7c521f478f855b738d91bd4f2435de223a89fa5f4b2369065a"
     },
     {
      "path": "/var/lib/dpkg/info/krb5-multidev:amd64.md5sums",
      "layerID": "sha256:2fbabeba902e7f7c521f478f855b738d91bd4f2435de223a89fa5f4b2369065a"
     },
     {
      "path": "/var/lib/dpkg/status",
      "layerID": "sha256:2fbabeba902e7f7c521f478f855b738d91bd4f2435de223a89fa5f4b2369065a"
     }
    ],
    "language": "",
    "licenses": [
     "GPL-2"
    ],
    "cpes": [
     "cpe:2.3:a:krb5-multidev:krb5-multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*",
     "cpe:2.3:a:krb5-multidev:krb5_multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*",
     "cpe:2.3:a:krb5_multidev:krb5-multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*",
     "cpe:2.3:a:krb5_multidev:krb5_multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*",
     "cpe:2.3:a:krb5:krb5-multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*",
     "cpe:2.3:a:krb5:krb5_multidev:1.18.3-6+deb11u1:*:*:*:*:*:*:*"
    ],
    "purl": "pkg:deb/debian/krb5-multidev@1.18.3-6+deb11u1?arch=amd64&upstream=krb5&distro=debian-11",
    "upstreams": [
     {
      "name": "krb5"
     }
    ]
   }
  }

Under artifact (a sibling of matchDetails in each match) there is a section called locations. This array lists the files associated with the vulnerable package along with their corresponding layerID.

Is this the information you're looking for?

What other layer information did you have in mind?
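To make the structure above concrete, here is a minimal Node.js sketch that collects the distinct layerID values found under artifact.locations for a single match. The helper name layersForMatch and the trimmed sample object are assumptions for illustration; only the field names come from the JSON shown above.

```javascript
"use strict";

// Layer digest copied from the example match above.
const LAYER_ID =
  "sha256:2fbabeba902e7f7c521f478f855b738d91bd4f2435de223a89fa5f4b2369065a";

// Hypothetical helper: return the distinct layer IDs that hold files
// belonging to the vulnerable package of one grype match.
function layersForMatch(match) {
  const locations = (match.artifact && match.artifact.locations) || [];
  return [...new Set(locations.map((loc) => loc.layerID))];
}

// Trimmed sample with the same shape as the match shown above.
const sampleMatch = {
  vulnerability: { id: "CVE-2004-0971", severity: "Negligible" },
  artifact: {
    name: "krb5-multidev",
    locations: [
      { path: "/usr/share/doc/krb5-multidev/copyright", layerID: LAYER_ID },
      { path: "/var/lib/dpkg/status", layerID: LAYER_ID },
    ],
  },
};

console.log(sampleMatch.vulnerability.id, layersForMatch(sampleMatch));
```

For a full report, the same helper could be applied to every entry of the top-level matches array.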

rmccarth commented 2 years ago

What I sort of had in mind when I posted this was a flag showing whether the vulnerability originated in the base image (the FROM directive of the Dockerfile) or in a higher layer of the image. Based on what you provided above, I could write some custom code to resolve this myself, but I wasn't sure whether this functionality was provided out of the box.
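The custom code mentioned here could look roughly like the following sketch. It assumes the caller already knows the layer digests of the base image (for example, from scanning the base image separately); classifyMatch, baseLayerIDs, and the placeholder digests are hypothetical and not part of grype.

```javascript
"use strict";

// Hypothetical helper: label a match "base" if at least one file of the
// package lives in a base-image layer, otherwise "user".
function classifyMatch(match, baseLayerIDs) {
  const locations = (match.artifact && match.artifact.locations) || [];
  if (locations.length === 0) return "unknown";
  const anyInBase = locations.some((loc) => baseLayerIDs.has(loc.layerID));
  return anyInBase ? "base" : "user";
}

// Placeholder digests; real values would come from the base image.
const baseLayerIDs = new Set(["sha256:base-1", "sha256:base-2"]);

// Minimal stand-ins for entries of the matches array in a grype report.
const matches = [
  {
    vulnerability: { id: "CVE-A" },
    artifact: {
      locations: [{ path: "/var/lib/dpkg/status", layerID: "sha256:base-1" }],
    },
  },
  {
    vulnerability: { id: "CVE-B" },
    artifact: {
      locations: [{ path: "/app/package.json", layerID: "sha256:user-1" }],
    },
  },
];

for (const m of matches) {
  console.log(m.vulnerability.id, classifyMatch(m, baseLayerIDs));
}
```

This is only a heuristic: shared files such as /var/lib/dpkg/status may be attributed to a later layer that rewrote them, so a package's locations do not always point at the layer where it was first installed.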

spiffcs commented 2 years ago

Ahh awesome!

We don't have any immediate plans to map these layerIDs back to a Dockerfile.

PRs are welcome if you have an idea for an improvement to syft that displays this in the way you described.

How do you see users wanting to consume this extra information? Is it just taking the layerID and mapping it back to the different directives in the Dockerfile, or is there some other detail I'm missing?

06kellyjac commented 2 years ago

I'm also interested in this. Generally I have a very simple container with one thing in it, so it's obvious where the problems come from. But now I'm working with a big container and I have no idea which binary/file is raising the issues beyond "it's go, it's python", etc.


edit:

I've been playing with this quick script here, might be of use to people

"use strict";

const fs = require("fs");

// Grype JSON report, e.g. saved from: grype -o json node:latest > file.json
const grypeRaw = fs.readFileSync("file.json", "utf8");
const grype = JSON.parse(grypeRaw);

const isDanger = (severity) => severity === "High" || severity === "Critical";
// Push the vulnerability object onto arr only if it is High or Critical.
const attachDangerVuln = (arr, vuln) => {
  if (isDanger(vuln.severity)) arr.push(vuln);
};

// Map of file path -> High/Critical vulnerabilities of the owning package.
const summary = {};

grype.matches.forEach((match) => {
  const vulns = [];
  // Pass the vulnerability object itself, not just its severity string.
  attachDangerVuln(vulns, match.vulnerability);
  (match.relatedVulnerabilities || []).forEach((rv) => attachDangerVuln(vulns, rv));
  if (vulns.length > 0) {
    match.artifact.locations.forEach((l) => {
      const fileName = l.path;
      const existing = summary[fileName] || [];
      summary[fileName] = existing.concat(vulns);
    });
  }
});

// Keep only the fields worth printing.
const cleanup = (vuln) => ({
  id: vuln.id,
  severity: vuln.severity,
  dataSource: vuln.dataSource,
});

for (const [file, vulns] of Object.entries(summary)) {
  console.log(`${file}:`, vulns.map(cleanup));
}

output:

FILENAME_HERE: [
  {
    id: 'CVE-XXXX-XXXXX',
    severity: 'High',
    dataSource: 'https://nvd.nist.gov/vuln/detail/CVE-XXXX-XXXXX'
  },
  {
    id: 'CVE-XXXX-XXXXX',
    severity: 'Critical',
    dataSource: 'https://nvd.nist.gov/vuln/detail/CVE-XXXX-XXXXX'
  }
]

wagoodman commented 9 months ago

A few thoughts / connections:

  1. Relating to syft issue https://github.com/anchore/syft/issues/435 : being able to enumerate at least the layer in which the package first appeared (not necessarily the last). I feel this is a prerequisite for implementing this feature with accurate results.

  2. Relating to the comment above, https://github.com/anchore/grype/issues/739#issuecomment-1152940916 : partitioning or labeling vulnerabilities as having come from the base image or a user-controlled layer. This is also connected to another syft issue, https://github.com/anchore/syft/issues/15 , which would further allow for "scopes" that differentiate "user" vs "base" layers. Going down this path would let users see only the vulnerabilities that pertain to user-controlled layers. The downside of this approach is that it does not address having a single document with all vulnerabilities and being able to select, within that grype document, which vulns came from user vs base layers (without a script).

How important is having a single grype result with both user and base vulns contained within it?
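As a rough illustration of the two points above, the sketch below works on a single report: it finds the earliest layer referenced by a match's locations, then splits matches into base and user groups. It assumes the caller supplies the image's layer digests ordered from base to top (orderedLayerIDs) and the number of layers contributed by the base image (baseLayerCount). Both inputs and both helper names are hypothetical, and the result is only as accurate as the layerID values recorded in the report.

```javascript
"use strict";

// Hypothetical helper: given layer digests ordered from base to top, return
// the earliest layer that holds any file of the match's package, or null.
function firstLayerForMatch(match, orderedLayerIDs) {
  const locations = (match.artifact && match.artifact.locations) || [];
  let best = -1;
  for (const loc of locations) {
    const idx = orderedLayerIDs.indexOf(loc.layerID);
    if (idx !== -1 && (best === -1 || idx < best)) best = idx;
  }
  return best === -1 ? null : { index: best, layerID: orderedLayerIDs[best] };
}

// Hypothetical helper: split matches by whether their earliest layer falls
// within the first baseLayerCount layers of the image.
function partitionMatches(matches, orderedLayerIDs, baseLayerCount) {
  const result = { base: [], user: [], unknown: [] };
  for (const match of matches) {
    const first = firstLayerForMatch(match, orderedLayerIDs);
    if (first === null) result.unknown.push(match);
    else if (first.index < baseLayerCount) result.base.push(match);
    else result.user.push(match);
  }
  return result;
}

// Placeholder digests, ordered base -> top; the first two are "base" layers.
const orderedLayerIDs = ["sha256:l0", "sha256:l1", "sha256:l2"];
const sampleMatches = [
  {
    vulnerability: { id: "CVE-A" },
    artifact: {
      locations: [
        { path: "/var/lib/dpkg/status", layerID: "sha256:l2" },
        { path: "/usr/share/doc/pkg/copyright", layerID: "sha256:l0" },
      ],
    },
  },
  {
    vulnerability: { id: "CVE-B" },
    artifact: { locations: [{ path: "/app/package.json", layerID: "sha256:l2" }] },
  },
];

const parts = partitionMatches(sampleMatches, orderedLayerIDs, 2);
console.log("base:", parts.base.map((m) => m.vulnerability.id));
console.log("user:", parts.user.map((m) => m.vulnerability.id));
```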