quay / claircore

foundation modules for scanning container packages and reporting vulnerabilities
https://quay.github.io/claircore/
Apache License 2.0
144 stars 85 forks source link

Debian images being given Ubuntu distribution in Scan Report #969

Open iainduncani opened 1 year ago

iainduncani commented 1 year ago

The Problem

We have some Debian test images such and they occasionally report their IndexReport.Distribution as Ubuntu but with some information from the Debian release such as:

{
  "manifest_hash": "sha256:6d65850246779247824388550b953c3252333f641f5f646849d132963639cea4",
  "packages": {},
  "distributions": {
    "7961": {
      "id": "7961",
      "did": "ubuntu",
      "name": "Ubuntu",
      "version": "11 (Bullseye)",
      "version_code_name": "bullseye",
      "version_id": "11",
      "arch": "",
      "cpe": "",
      "pretty_name": "Ubuntu 11"
    }
  },
  "repository": {},
  "environments": {},
  "vulnerabilities": {},
  "package_vulnerabilities": {},
  "enrichments": {}
}

We run Clair in a few different environments and sometimes this image will correctly scan with a distribution did of debian. Furthermore, looking into the dist table we see there are lots of Debian releases but labelled as Ubuntu.

SELECT name, version, version_code_name, version_id, pretty_name FROM public.dist
WHERE name = 'Ubuntu'
ORDER BY id ASC LIMIT 100
name version version_code_name version_id pretty_name
Ubuntu 18.04.3 LTS (Bionic Beaver) bionic 18.04 Ubuntu 18.04.3 LTS
Ubuntu 20.04 LTS (Focal Fossa) focal 20.04 Ubuntu 20.04 LTS
Ubuntu 16.04.6 LTS (Xenial Xerus) xenial 16.04 Ubuntu 16.04.6 LTS
Ubuntu 19.04 (Disco Dingo) disco 19.04 Ubuntu 19.04
Ubuntu 18.04 (Bionic) bionic 18.04 Ubuntu 18.04
Ubuntu 19.04 (Disco) disco 19.04 Ubuntu 19.04
Ubuntu 11 (Bullseye) bullseye 11 Ubuntu 11
Ubuntu 22.04 (Jammy) jammy 22.04 Ubuntu 22.04
Ubuntu 20.04 (Focal) focal 20.04 Ubuntu 20.04
Ubuntu 9 (Stretch) stretch 9 Ubuntu 9
Ubuntu 16.04 (Xenial) xenial 16.04 Ubuntu 16.04
Ubuntu 10 (Buster) buster 10 Ubuntu 10
Ubuntu 12 (Bookworm) bookworm 12 Ubuntu 12

Version 9, 10, 11 and 12 are all Debian releases.

The Cause

Looking at the code I think this is caused by the fact that both the Debian and Ubuntu DistributionScanner just look for a etc/os-release file. If that file has an VERSION_ID and VERSION_CODENAME field then they will match and store their dist in the DB linked to this layer. Both hard code the did to either ubuntu or debian when they do so. When the scanner has finished the distribution they find get written to the database by the layerscanner explaining why we have lots of dist table entries for Ubuntu on Debian images. (note we don't have dist entries the other way for Ubuntu images marked as Debian because the Debian scanner also tries to conver the version to an int which fails for Ubuntu images)

Looking at the etc/os-release file for our images (we've seen this with both debian:stable-slim and gcr.io/distroless/base based images) we can see that it will meet the criteria for both scanners:

PRETTY_NAME="Debian GNU/Linux 11 (bullseye)"
NAME="Debian GNU/Linux"
VERSION_ID="11"
VERSION="11 (bullseye)"
VERSION_CODENAME=bullseye
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

As predicted when this gets to the dist_scanartifact table we have two links to the dist table, one for Ubuntu and one for Debian:

psql -c "SELECT * FROM dist d
INNER JOIN dist_scanartifact dsa
ON dsa.dist_id=d.id
INNER JOIN manifest_layer ml
ON ml.layer_id=dsa.layer_id
INNER JOIN manifest m
ON m.id=ml.manifest_id
WHERE m.hash='sha256:6d65850246779247824388550b953c3252333f641f5f646849d132963639cea4';"
  id  |       name       |  did   |    version    | version_code_name | version_id | arch | cpe |          pretty_name           | dist_id | scanner_id | layer_id | i | manifest_id | layer_id |                                  hash                                   |   id
------+------------------+--------+---------------+-------------------+------------+------+-----+--------------------------------+---------+------------+----------+---+-------------+----------+-------------------------------------------------------------------------+---------
    6 | Debian GNU/Linux | debian | 11 (bullseye) | bullseye          | 11         |      |     | Debian GNU/Linux 11 (bullseye) |       6 |         87 | 56283318 | 0 |     3702805 | 56283318 | sha256:6d65850246779247824388550b953c3252333f641f5f646849d132963639cea4 | 3702805
 7961 | Ubuntu           | ubuntu | 11 (Bullseye) | bullseye          | 11         |      |     | Ubuntu 11                      |    7961 |         88 | 56283318 | 0 |     3702805 | 56283318 | sha256:6d65850246779247824388550b953c3252333f641f5f646849d132963639cea4 | 3702805
(2 rows)

After the two distributions have been linked to the layer and the layer scanning phase has completed the Controller goes into the coalesce phase. Here the two distributions will be loaded using the DistributionsByLayer query. The coalescer for dpkg will then be asked to coalesce this data. The dpkg ecosystem uses the Linux Coalescer to do this. The Linux Coalescer only allows a single dist per layer:

func NewDistSearcher(artifacts []*indexer.LayerArtifacts) DistSearcher {
    dists := make([]*claircore.Distribution, len(artifacts))

    // record where dists show up in layers via slice
    for i, artifact := range artifacts {
        if len(artifact.Dist) > 0 {
            dists[i] = artifact.Dist[0] // we dont support multiple dists found in a layer
        }
    }
    return DistSearcher{dists}
}

So at this point the two distributions we see in the database will be collapsed into one that we see in the ScanReport. I think that the one that is picked will be semi-random based on the order that is returned from the DistributionsByLayer query. One thing that confuses me though is that we don't see this problem closer to 50% of the time because that is an unordered query but we do seem to get the right distribution for both Ubuntu and Debian images the majority of the time even though they have two distributions linked in the dist_scanartifact table, I suspect it is purely because the dpkg Ecosystem returns the Debian scanner before the Ubuntu one and that PostgreSQL isn't fully random on unordered sets. Whatever the reason for it not being 50% it clearly isn't fully reliable though as we have seen this error several times.

The solution?

I noticed that the RHEL DistributionScanner also checked for the presence of files but in addition did a regex match on them for Red Hat Enterprise Linux (?:Server)?\s*(?:release)?\s*(\d+)(?:\.\d)?. Could this be added to the Ubuntu and Debian scanners to get the right answer?

iainduncani commented 1 year ago

One additional thought that one of my team members pointed out to me this morning - I think even with updating the layer scanner the links in the dist_scanartifact will persist so a change to this may require also deleting the incorrect dist_scanartifact links (maybe as part of a SQL migration? The incorrect dist values would be easy to identify due to the different version number schemes so could just delete the dist and let it cascade to other tables)

RTann commented 1 year ago

From what I can tell, I believe the Ubuntu distribution scanner may think it's Debian but not the other way around because of the following:

The Debian distribution scanner verifies the etc/os-release file is for Debian via this check:

if kv[`ID`] != `debian` {
    return nil, nil
}

The Ubuntu distribution scanner does no such check. I verified this happens by adding a test in ubuntu/distributionscanner_test.go and adding the etc/os-release for Debian 11 inside ubuntu/testdata. I saw the following test pass:

{
    Name: "11",
    Want: claircore.Distribution{
        Name:            "Ubuntu",
        DID:             "ubuntu",
        VersionID:       "11",
        PrettyName:      "Ubuntu 11",
        VersionCodeName: "bullseye",
        Version:         "11 (Bullseye)",
    },
},

We'll see an equivalent test in debian/distributionscanner_test.go will fail (as-is, it will panic due to how the test is setup)

I think the solution for this would to be to simply verify ID or DISTRIB_ID field actually represents Ubuntu.

RTann commented 1 year ago

As for potential migration, I'll defer that to someone else, but I believe as long as we update the Ubuntu distribution scanner and bump the version specified in the file, then Clair users should be able to recognize the indexer has been updated, so the index report must be regenerated. Maybe just by redoing the index report there is no need for a migration? @crozzy ?

RTann commented 1 year ago

I started a PR to attempt to resolve this: https://github.com/quay/claircore/pull/970

crozzy commented 1 year ago

Yeah, I think iterating the version of the ubuntu distribution scanner should work, the thing I'm not 100% clear on is whether we'd also need to iterate the debian distribution scanner too, but I guess it'll be clearer during PR testing

iainduncani commented 1 year ago

@RTann thanks so much for getting the fix in for this so quickly. We've just been testing this in our dev environment. One thing that we have found is that because we have layers in the database that are linked to these incorrect "Ubuntu" entries via a row in the dist_scanartifact when we rescan an image that has one of these layers because the old link is not removed it can still get the wrong result.

What would you advise is the right solution here? Our thoughts are that we will manually tidy up all of these wrong dist entries and dist_scanartifact links prior to upgrading and then when the images are rescanned they will not get added back in due to this fix and therefore there will only be one dist to find in the coalesce phase.

Do you think that our outside-of-clair approach is the correct one or would you rather something went into Clair in addition that picked the right answer when two were found. I'd wondered about making the linux.Coalescer be cleverer in picking the right dist and filter out Ubuntu versions that don't have a . in them rather than just picking [0] as it does at the moment but that feels like a bit of a hack for a problem that should not come back if we tidy the DB. Would love your opinion though.