One reason to consider switching back to `ubuntu` base images

consideRatio commented 2 years ago

Update

Debian based images don't seem to be less safe than ubuntu based ones, even though more security vulnerabilities are reported for them. Both distributions publish security overviews, and the key difference is how security scanners like trivy interpret them. See https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2918#issuecomment-1295813128

Original topic

The hub pod's base image is as of #2733 python:3.9-slim-bullseye, which is based on debian:bullseye-slim. Previously we were based on ubuntu.

I've observed that debian:bullseye-slim seems to come with a lot of unfixed known vulnerabilities, while ubuntu:22.04 doesn't. That makes me consider whether we should go back to ubuntu:22.04 and install python ourselves again. I don't think this alone is enough motivation yet, but if other reasons come up, maybe we should go back to ubuntu.

A comment on the number of security vulnerabilities in the hub image: https://github.com/jupyterhub/zero-to-jupyterhub-k8s/issues/2586#issuecomment-1290389415
An overview of the current known vulnerabilities in our hub pod: https://artifacthub.io/packages/helm/jupyterhub/jupyterhub?modal=security-report

betatim commented 2 years ago

Do you know which sha/hash (not quite sure what to call it) of the base image (python:3.9-slim-bullseye) is being used to build the hub image? The reason I'm asking is that the tag (3.9-slim-bullseye) gets rebuilt periodically on docker hub. So maybe that newer build would fix more vulnerabilities? However docker build ... will not fetch the newest build if it already has an image for the same tag. For a work project we switched to using docker build --pull --no-cache ... so that each time you build the image you fetch the latest available base image.

I don't know how image caching in GH Actions works. Is it shared between our builds, what is available on the node our job runs, no sharing at all? Hence the idea of looking at the SHA that is being used and comparing it to the latest available one (compared to when the hub image was built).
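A minimal sketch of the comparison described above, assuming docker is available locally and that the hub image is built from a Dockerfile in the current directory. The function names and the `hub-image` tag are hypothetical and only serve the illustration; nothing here is taken from the project's actual CI setup.

```bash
#!/usr/bin/env bash
# Sketch: check whether the locally cached base image tag moved after a
# fresh pull, then rebuild without relying on cached layers.

BASE_IMAGE="python:3.9-slim-bullseye"

# Print the image ID (sha256:...) a local tag currently resolves to,
# or nothing if the tag is not present locally.
local_image_id() {
  docker image inspect --format '{{.Id}}' "$1" 2>/dev/null || true
}

# Succeed only when both IDs are known and differ, meaning the cached
# copy was older than what the registry serves now.
tag_moved() {
  [ -n "$1" ] && [ -n "$2" ] && [ "$1" != "$2" ]
}

# Hypothetical entry point: compare IDs around a pull, then rebuild.
check_and_rebuild() {
  local before after
  before="$(local_image_id "$BASE_IMAGE")"
  docker pull "$BASE_IMAGE"
  after="$(local_image_id "$BASE_IMAGE")"
  if tag_moved "$before" "$after"; then
    echo "cached base image was stale: $before -> $after"
  fi
  # --pull re-checks the registry for the base image,
  # --no-cache ignores previously built layers.
  docker build --pull --no-cache -t hub-image .
}
```

Calling `check_and_rebuild` prints a note when the cached tag was stale and always rebuilds from the freshest base image. The script itself only defines functions, so sourcing it has no side effects.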

consideRatio commented 2 years ago

Do you know which sha/hash (not quite sure what to call it) of the base image (python:3.9-slim-bullseye) is being used to build the hub image? The reason I'm asking is that the tag (3.9-slim-bullseye) gets rebuilt periodically on docker hub. So maybe that newer build would fix more vulnerabilities?

I've tested with the absolute latest images using trivy locally on my computer! It's also not the python image that is the problem, it is the debian image it relies on, independent of whether it's the slim or non-slim version!

So my conclusion atm is that either debian is slower to provide patches to their apt packages than ubuntu is, or debian's known vulnerabilities are more discoverable, making it an unfair comparison, or something similar.

consideRatio commented 2 years ago

Update

@tianon provided a great comment in https://github.com/docker-library/python/issues/708#issuecomment-1295196071 about how trivy and other container scanners pick up or treat declared intent by ubuntu compared to debian.

For some added insight, the primary reason I've found over the years that CVE scanners are less kind to Debian than they are to Ubuntu is that Ubuntu clearly marks CVEs as "WONTFIX" (which the scanners then interpret as "hide this CVE"), where the Debian Security Team never says outright that something will not be fixed because any Debian Developer / member of the project (or someone with enough time to do the work and find a sponsor for it) can fix any issue they like.

For example, the maintainer of a given package could choose to fix every CVE filed against their packages in stable, if they so chose, even if the Security Team at large has decided that most of them aren't worth fixing in stable. The way the Security Team normally telegraphs their evaluation of the CVE's severity within the context of the Debian package and how it is used is with a "no-dsa" tag in the notes, which IMO all the scanning vendors should pick up and interpret the same as they do "WONTFIX" in Ubuntu (because it really is the exact same meaning), but I have not seen any of them doing so, even in spite of the many long meetings I've had with several of them describing this. :disappointed:

Case example

I looked into CVE-2005-2541. The container scanner trivy reports it as a problem in debian but not in ubuntu.

trivy ubuntu:22.04 | grep CVE-2005-2541
# no output

trivy debian:11 | grep CVE-2005-2541
| tar              | CVE-2005-2541    |          | 1.34+dfsg-1       |               | tar: does not properly warn the user    |

Still, both debian and ubuntu behave the same when tested practically, so the difference must be in what trivy is reporting about it.

Bash script used as the practical test:

```bash
echo
echo "------------------------------------------------------------------------"
echo "This script investigates if CVE-2005-2541 is addressed differently"
echo "between debian:11 and ubuntu:22.04, as that helps when considering when"
echo "container scanners report it to be fixed in ubuntu but not fixed in"
echo "debian."
echo
echo "ref: https://www.cvedetails.com/cve/CVE-2005-2541/"
echo "ref: https://security-tracker.debian.org/tracker/CVE-2005-2541"
echo
echo "\$ tar --help | grep -A2 -- --preserve-permissions"
echo
echo "  -p, --preserve-permissions, --same-permissions"
echo "                             extract information about file permissions"
echo "                             (default for superuser)"
echo "------------------------------------------------------------------------"
echo
echo
echo

# go to a safe location
mkdir -p /tmp/docker-mounted-dir
cd /tmp/docker-mounted-dir

# dump test script to be run in containers
cat <<EOT > test
cd /tmp

# setup: create an archive containing a file with the setuid bit set
mkdir -p tarme
touch tarme/setuid-file
chmod u+s tarme/setuid-file
tar -cf archive.tar tarme
rm -rf tarme

# test: extract with and without -p and compare the resulting permissions
echo "ls -l after unpacking with/without -p flag"
tar -pxf archive.tar tarme
ls -l tarme/setuid-file
rm -rf tarme
tar -xf archive.tar tarme
ls -l tarme/setuid-file
rm -rf tarme

# cleanup
rm -rf tarme
rm archive.tar
EOT

echo "----------------------------"
echo "tar -pxf and tar -xf as root"
echo "----------------------------"
echo
echo "debian:11"
docker run -it --rm -v "$(pwd):$(pwd)" debian:11 bash "$(pwd)/test"
echo
echo "ubuntu:22.04"
docker run -it --rm -v "$(pwd):$(pwd)" ubuntu:22.04 bash "$(pwd)/test"
echo

echo "------------------------------"
echo "tar -pxf and tar -xf as nobody"
echo "------------------------------"
echo
echo "debian:11"
docker run -it --rm -v "$(pwd):$(pwd)" --user=nobody debian:11 bash "$(pwd)/test"
echo
echo "ubuntu:22.04"
docker run -it --rm -v "$(pwd):$(pwd)" --user=nobody ubuntu:22.04 bash "$(pwd)/test"
echo
```

------------------------------------------------------------------------
This script investigates if CVE-2005-2541 is addressed differently
between debian:11 and ubuntu:22.04, as that helps when considering when
container scanners report it to be fixed in ubuntu but not fixed in
debian.

ref: https://www.cvedetails.com/cve/CVE-2005-2541/
ref: https://security-tracker.debian.org/tracker/CVE-2005-2541

$ tar --help | grep -A2 -- --preserve-permissions

  -p, --preserve-permissions, --same-permissions
                             extract information about file permissions
                             (default for superuser)
------------------------------------------------------------------------

----------------------------
tar -pxf and tar -xf as root
----------------------------

debian:11
ls -l after unpacking with/without -p flag
-rwSr--r-- 1 root root 0 Oct 29 11:27 tarme/setuid-file
-rwSr--r-- 1 root root 0 Oct 29 11:27 tarme/setuid-file

ubuntu:22.04
ls -l after unpacking with/without -p flag
-rwSr--r-- 1 root root 0 Oct 29 11:27 tarme/setuid-file
-rwSr--r-- 1 root root 0 Oct 29 11:27 tarme/setuid-file

------------------------------
tar -pxf and tar -xf as nobody
------------------------------

debian:11
ls -l after unpacking with/without -p flag
-rwSr--r-- 1 nobody nogroup 0 Oct 29 11:27 tarme/setuid-file
-rw-r--r-- 1 nobody nogroup 0 Oct 29 11:27 tarme/setuid-file

ubuntu:22.04
ls -l after unpacking with/without -p flag
-rwSr--r-- 1 nobody nogroup 0 Oct 29 11:27 tarme/setuid-file
-rw-r--r-- 1 nobody nogroup 0 Oct 29 11:27 tarme/setuid-file

Conclusion

This seems to be a trivy / debian issue: trivy for reporting this inconsistently, and possibly debian for not making it easy enough for trivy to draw the right conclusions. I'd love to track this upstream, but I don't know where yet. @tianon, do you know if this situation is being tracked by trivy or debian somewhere?

I'd very much like to resolve this upstream, or at least have a reference to an upstream discussion about resolving it, before working around it by switching to ubuntu.

consideRatio commented 2 years ago

I think https://github.com/aquasecurity/trivy-db (EDIT: no probably https://github.com/aquasecurity/vuln-list) may be a place to create an issue. I could not find an issue there about it. I saw a trace about no-dsa in their source code.

Using trivy image --severity=HIGH --format json debian:11.5 I fetched this part about the CVE.

      {
        "VulnerabilityID": "CVE-2005-2541",
        "PkgName": "tar",
        "InstalledVersion": "1.34+dfsg-1",
        "Layer": {
          "Digest": "sha256:17c9e6141fdb3387e5a1c07d4f9b6a05ac1498e96029fa3ea55470d4504f7770",
          "DiffID": "sha256:d9d07d703dd5ba0b8e23bf7e1bd9f7e4093418a58dc9e470ca013d1c3a1b5bb5"
        },
        "SeveritySource": "nvd",
        "PrimaryURL": "https://avd.aquasec.com/nvd/cve-2005-2541",
        "Title": "tar: does not properly warn the user when extracting setuid or setgid files",
        "Description": "Tar 1.15.1 does not properly warn the user when extracting setuid or setgid files, which may allow local users or remote attackers to gain privileges.",
        "Severity": "HIGH",
        "CVSS": {
          "nvd": {
            "V2Vector": "AV:N/AC:L/Au:N/C:C/I:C/A:C",
            "V2Score": 10
          },
          "redhat": {
            "V3Vector": "CVSS:3.1/AV:L/AC:H/PR:N/UI:R/S:U/C:H/I:H/A:H",
            "V3Score": 7
          }
        },
        "References": [
          "http://marc.info/?l=bugtraq\u0026m=112327628230258\u0026w=2",
          "https://lists.apache.org/thread.html/rc713534b10f9daeee2e0990239fa407e2118e4aa9e88a7041177497c@%3Cissues.guacamole.apache.org%3E"
        ],
        "PublishedDate": "2005-08-10T04:00:00Z",
        "LastModifiedDate": "2021-06-18T15:15:00Z"
      }
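As a rough sketch of how the same check could be repeated for several images, the helper below counts how often a CVE ID appears in a trivy JSON report read from stdin. It assumes the pretty-printed `"VulnerabilityID": "..."` layout shown above and uses plain grep rather than a JSON parser, so it is only an approximation. The function names `count_cve` and `compare_images` are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: count findings for one CVE in trivy JSON reports.

# Read a trivy JSON report on stdin and print how many findings
# carry the given CVE ID. Prints 0 when there are none.
count_cve() {
  grep -c "\"VulnerabilityID\": \"$1\"" || true
}

# Scan each image with trivy and print "<image>: <count>" for the CVE.
# Requires trivy to be installed and able to pull the images.
compare_images() {
  local cve="$1"
  shift
  local image
  for image in "$@"; do
    printf '%s: %s\n' "$image" \
      "$(trivy image --quiet --format json "$image" | count_cve "$cve")"
  done
}
```

For example, `compare_images CVE-2005-2541 debian:11 ubuntu:22.04` would print one line per image with the number of matching findings.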

Aqua Security's website lists it as a 0/10 score issue for Ubuntu, but doesn't mention Debian: https://avd.aquasec.com/nvd/2005/cve-2005-2541/

Debian doesn't mark https://security-tracker.debian.org/tracker/CVE-2005-2541 as no-dsa though, while they do for https://security-tracker.debian.org/tracker/CVE-2021-38370, for example. Hmmm...

Overall, I can't conclude clearly what could be done or what makes sense to do =/

consideRatio commented 2 years ago

Looking at https://github.com/aquasecurity/vuln-list#source, it seems the issue is that trivy has a source from ubuntu that omits CVE-2005-2541, while it has a source from debian that includes it. Hmmm...

Debian: https://security-tracker.debian.org/tracker/CVE-2005-2541 Ubuntu: https://ubuntu.com/security/cves?q=CVE-2005-2541

Looking at another, newer CVE, it seems that ubuntu reporting it as "needs triage" makes it not show up for ubuntu images, while it does show up for debian images, since debian has concluded that they are vulnerable.

https://ubuntu.com/security/CVE-2021-46848 https://security-tracker.debian.org/tracker/CVE-2021-46848
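To make the effect concrete, here is a toy model of the two cases above. The four rows only restate what was observed in this thread and are not an export from any scanner database. The set of statuses a scanner hides is an assumption passed in as a parameter, and the function names are hypothetical.

```bash
#!/usr/bin/env bash
# Toy model: how hiding certain statuses changes the per-distro count.

# columns: distro  cve  status as seen by the scanner
# (illustrative rows based on the observations in this thread)
findings() {
  cat <<'EOF'
debian CVE-2005-2541 included
ubuntu CVE-2005-2541 omitted
debian CVE-2021-46848 vulnerable
ubuntu CVE-2021-46848 needs-triage
EOF
}

# Print how many findings remain for a distro once every status in the
# comma-separated "hidden" list is filtered out (assumed scanner behavior).
reported_count() {
  local distro="$1" hidden="$2"
  findings | awk -v d="$distro" -v h="$hidden" '
    BEGIN { n = split(h, parts, ","); for (i = 1; i <= n; i++) hide[parts[i]] = 1 }
    $1 == d && !($3 in hide) { count++ }
    END { print count + 0 }'
}
```

With these rows, `reported_count debian "omitted,needs-triage"` prints 2 while the same call for ubuntu prints 0, even though both distros are affected by the same two CVEs in this model.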

Opinion evolution

It seems that debian only looks more vulnerable according to trivy because debian provides more information. I'm inclined to say we should stick with debian and not go back to ubuntu.

Maybe artifacthub.io etc could convey more strongly a score related to what is fixable, and not judge too harshly by what is not fixable? Arrrgh such a tricky topic.

manics commented 2 years ago

My view on this:

- We use a well known base image that's generally accepted for use in large enterprises and fully supported by upstream
  - "accepted for use in large enterprises" is a bit vague, but hopefully the intent is clear: Linux distributions that most businesses are happy to run. Debian and Ubuntu 20.04/22.04 both fall within this.
  - We effectively delegate the decision on whether packages are safe to the distribution maintainers. For an enterprise/equivalent grade distribution this is a reasonable assumption
  - The exception is if the distribution says a package is vulnerable, cannot be updated, or we have information to the contrary
- We add a minimal hub image, and make a best effort to minimise reported security vulnerabilities in it. This immediately reduces the number of vulnerabilities we have to consider, with a side benefit of it being easier for organisations to audit the image if necessary.
- If a dependency is under our control (e.g. Python packages) we upgrade or find a replacement if feasible
- If the dependency is in the Python distribution we upgrade Python
- If the reported vulnerability is in a distribution package and it's in the full image we ignore it
- Optionally: If the reported vulnerability is in a distribution package and it's in the minimal hub image we investigate, and if it's not relevant we add a comment to the Dockerfile (or somewhere else)

We could add something along these lines to our docs if we agree a policy?

betatim commented 2 years ago

Thanks a lot for doing so much digging Erik!! I learnt a lot!

I think I agree with the idea to not switch and maybe to add a note to the docs explaining the reasoning. It seems like this knowledge about how to interpret/ignore the results of automated scanners is not widespread and pretty subtle. After all it took you a while to find all the info.

I'd then invest our human effort in making sure we always use the latest build of the base image, and in other things that increase users' security, instead of chasing numbers based on what a robot says. It would be nice if there was a better signal to noise ratio, but as long as it is like this, it seems the best thing to do is to not pay too much attention to it, keep an eye out for legit CVEs, and get the patches into the images repo2docker builds.

consideRatio commented 2 years ago

I'll go for a close here as I conclude no action is needed at this point, thinking that for now we stick with debian based images. Thanks for brainstorming about this everyone!!

consideRatio commented 2 years ago

Oh, wait, regarding a policy.

We could add something along these lines to our docs if we agree a policy?

If we have a policy, I figure it should only be an inline comment in the Dockerfile. That is where people may look and consider adding patches, so it's better that it's written there directly.

I'd support a policy that doesn't add to what we currently do, because I think what we do is sufficient.

- We do apt-get upgrade to ensure we have the latest apt packages, so that if the base image is a bit outdated, we don't have to be.
- We refreeze our requirements.txt before each release to get new versions.
- We provide development releases, by rebuilding the image, whenever a security vulnerability is patched.
- We use a well known base image, python, based on the well known debian, which is acceptable security-wise to many for whom security is critical.
