anchore / syft

CLI tool and library for generating a Software Bill of Materials from container images and filesystems
Apache License 2.0

Syft does not detect some software in Docker Official Images #1197

Closed captn3m0 closed 1 year ago

captn3m0 commented 2 years ago

What happened:

Docker Official Images are widely used across the ecosystem, but many of them install their primary software from source rather than through a package manager, so Syft does not detect a lot of these components. These are critical foundational dependencies that are being missed.

What you expected to happen: Syft should detect foundational packages via other means.

How to reproduce it (as minimally and precisely as possible): Run scans on Docker Official Images and check whether the primary dependency is picked up. Examples:

Python

Neither the slim nor the alpine variant detects the installed Python version:

python:3-slim ❌

syft -q  packages library/python:3-slim|grep python
pip                     22.2.2                        python
setuptools              63.2.0                        python
wheel                   0.37.1                        python

python:3-alpine ❌

syft -q  packages library/python:3-alpine|grep python
.python-rundeps         20220907.224335   apk
pip                     22.2.2            python
setuptools              63.2.0            python
wheel                   0.37.1            python

python:2 ✔️

This one does work (partial output):

syft -q  packages library/python:2|grep python
pip                           20.0.2                       python
python                        2.7.16-1                     deb
python-minimal                2.7.16-1                     deb
python2                       2.7.16-1                     deb
python2-minimal               2.7.16-1                     deb
python2.7                     2.7.16-2+deb10u1             deb
python2.7-minimal             2.7.16-2+deb10u1             deb
python3                       3.7.3-1                      deb
python3-distutils             3.7.3-1                      deb
python3-lib2to3               3.7.3-1                      deb
python3-minimal               3.7.3-1                      deb
python3.7                     3.7.3-2+deb10u1              deb
python3.7-minimal             3.7.3-2+deb10u1              deb

python:3 ✔️

syft -q  packages library/python:3|grep python

libpython3-stdlib             3.9.2-3                         deb
libpython3.9-minimal          3.9.2-1                         deb
libpython3.9-stdlib           3.9.2-1                         deb
mercurial                     5.6.1                           python
pip                           22.2.2                          python
python3                       3.9.2-3                         deb
python3-distutils             3.9.2-1                         deb
python3-lib2to3               3.9.2-1                         deb
python3-minimal               3.9.2-3                         deb
python3.9                     3.9.2-1                         deb
python3.9-minimal             3.9.2-1                         deb
setuptools                    63.2.0                          python
wheel                         0.37.1                          python

Busybox ❌

None of these discovers busybox as installed:

syft -q  packages library/busybox:1.35-musl
syft -q  packages library/busybox:stable

Redis ❌

None of the following picks up Redis:

syft -q  packages library/redis:7|grep redis
syft -q  packages library/redis:7-alpine|grep redis
syft -q  packages library/redis:6-alpine|grep redis

Nodejs ❌

None of the following picks up Node.js:

syft -q  packages library/node:18-alpine|grep node
syft -q  packages library/node:buster-slim|grep node
syft -q  packages library/node:buster | grep node

Traefik ❌

Picks up an incorrect version:

syft -q  packages library/traefik:latest|grep traefik
syft -q  packages library/traefik:1.7|grep traefik

httpd ❌

Neither of these picks up httpd/apache:

syft -q  packages library/httpd:2
syft -q  packages library/httpd:alpine

memcached ❌

None of these picks up memcached:

syft -q  packages library/memcached:1 | grep memc
syft -q  packages library/memcached:alpine | grep memc

Golang ❌

syft -q  packages library/golang:alpine
syft -q  packages library/golang:buster

Consul ❌

It does report some versions, but all of them are incorrect:

syft -q  packages library/consul | grep consul
github.com/hashicorp/consul                         v0.0.0-20220811190700-c6d0f9ecc48e     go-module
github.com/hashicorp/consul-awsauth                 v0.0.0-20220713182709-05ac1c5c2706     go-module
github.com/hashicorp/consul-net-rpc                 v0.0.0-20220307172752-3602954411b4     go-module
github.com/hashicorp/consul/api                     v1.14.0                                go-module
github.com/hashicorp/consul/sdk                     v0.11.0                                go-module

The correct version is 1.13.1:

docker run library/consul version
Consul v1.13.1
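One way a downstream tool could flag such entries is to treat Go pseudo-versions, and the "(devel)" placeholder that also appears in the Vault output further down, as "no usable release version". The following is a minimal sketch, not something syft does today. The function name is made up for illustration, and the regular expression is an approximation of the pseudo-version format (14-digit timestamp followed by a 12-character commit prefix).

```python
import re

# Approximate shape of a Go pseudo-version, e.g.
# v0.0.0-20220811190700-c6d0f9ecc48e (timestamp + 12-char commit prefix).
PSEUDO_VERSION = re.compile(
    r"^v\d+\.\d+\.\d+-(?:.*\.)?\d{14}-[0-9a-f]{12}(?:\+incompatible)?$"
)


def is_release_version(version):
    """Return False for placeholders that do not identify a tagged release.

    Hypothetical helper for illustration only.
    """
    if version == "(devel)":
        return False
    return PSEUDO_VERSION.match(version) is None


# Versions copied from the syft output in this issue.
REPORTED = {
    "github.com/hashicorp/consul": "v0.0.0-20220811190700-c6d0f9ecc48e",
    "github.com/hashicorp/consul/api": "v1.14.0",
    "github.com/hashicorp/vault": "(devel)",
}

for module, version in REPORTED.items():
    status = "release" if is_release_version(version) else "placeholder"
    print(f"{module} {version} -> {status}")
```

Note that v1.14.0 is a real release of the consul/api module, but it is still not the Consul product version (1.13.1), so a check like this only filters out the obvious placeholders.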

Nextcloud ❌

syft -q  packages library/nextcloud:23|grep next

Influxdb ❌✔️

Detected in the debian version, not in the alpine one.

syft -q  packages library/influxdb:1.8-alpine|grep influx # Not detected
syft -q  packages library/influxdb:1.8|grep influx # Detected

Wordpress ❌

Not detected.

syft -q  packages library/wordpress:6
syft -q  packages library/wordpress:6-fpm-alpine|grep wordpress
.wordpress-phpexts-rundeps  20220810.051345        apk

Ruby ❌

Ruby is not detected in any of the Ruby images:

syft -q  packages library/ruby:3-alpine|grep ruby
.ruby-rundeps           20220810.020529        apk
ruby-typeprof           0.20.1                 npm
ruby2_keywords          0.0.5                  gem

syft -q  packages library/ruby:3-slim|grep ruby
ruby-typeprof           0.20.1                        npm
ruby2_keywords          0.0.5                         gem

syft -q  packages library/ruby:3|grep ruby
ruby-typeprof                 0.20.1                          npm
ruby2_keywords                0.0.5                           gem

Haproxy ❌

syft -q  packages library/haproxy:2.6|grep haproxy

syft -q  packages library/haproxy:2.6-alpine|grep haproxy
.haproxy-rundeps        20220907.203036  apk

PHP ❌

syft -q  packages library/php:7.4-cli|grep php
syft -q  packages library/php:8.0-cli-alpine|grep php

Bash ❌

syft -q  packages library/bash:5-alpine3.15|grep bash
.bash-rundeps           20220809.175539   apk

Vault ❌

Detects wrong version.

syft -q  packages library/vault:1.11.3|grep vault
github.com/hashicorp/hcl                                       v1.0.1-vault-3                         go-module
github.com/hashicorp/vault                                     (devel)                                go-module

Anything else we need to know?: I picked roughly the first 30 images from https://hub.docker.com/search?image_filter=official&q= for the survey. The images where detection works either install the software through a package manager (mariadb, mysql) or ship it as Java archives, which are easy to detect outside a package manager (tomcat, sonarqube).

Python, PHP, Ruby, Node, and Golang are arguably the most depended-upon base images, and syft should detect the primary dependency in each of them.
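The checks in this report were done with grep, which also matches the type column (that is why pip shows up when grepping for python). A stricter check could compare the name column exactly. Below is a minimal sketch under the assumption that the table output is whitespace-separated with the name first and the type last; the function names are made up, and the header row name is an assumption since the grep output above does not show it.

```python
def parse_syft_table(text):
    """Parse table rows into (name, version, type) tuples.

    Assumes whitespace-separated columns: name first, type last.
    """
    rows = []
    for line in text.splitlines():
        cols = line.split()
        if len(cols) < 3:
            continue  # blank or malformed line
        if cols[0] == "NAME":
            continue  # assumed header row, not shown in the grep output
        rows.append((cols[0], cols[1], cols[-1]))
    return rows


def find_package(text, name):
    """Return rows whose package name equals `name` exactly."""
    return [row for row in parse_syft_table(text) if row[0] == name]


# Rows copied from the python:3-slim and python:2 outputs in this issue.
SLIM_OUTPUT = """\
pip                     22.2.2                        python
setuptools              63.2.0                        python
wheel                   0.37.1                        python
"""

PYTHON2_OUTPUT = """\
pip                           20.0.2                       python
python                        2.7.16-1                     deb
python2.7                     2.7.16-2+deb10u1             deb
"""

print(find_package(SLIM_OUTPUT, "python"))     # no exact match
print(find_package(PYTHON2_OUTPUT, "python"))  # the distro package
```

With the slim image the result is empty, which matches the ❌ above; with python:2 the only exact match is the Debian package.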

Usage Context

This request comes via the endoflife.date project, where we are working towards detecting EOL products by scanning SBOMs. The plan is to "leave the detection to the existing SBOM ecosystem" (i.e., products like syft), while we provide feeds/PURLs/scanners to actually find EOL products in existing SBOMs.

Unfortunately, the most common use case for EOL checks (programming language EOL) is not met by syft, hence this issue.
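To make the intended workflow concrete, here is a rough sketch of what an EOL check over SBOM entries might look like. Everything in it is an assumption for illustration: the feed is a hand-written sample (not the actual endoflife.date data format), the PURL shape pkg:generic/python@<version> is hypothetical, and the parser only handles simple PURLs without namespace, qualifiers, or subpath.

```python
from datetime import date

# Hand-written sample feed keyed by (product, release cycle).
# Not the real endoflife.date format; Python 2.7 reached EOL on 2020-01-01.
SAMPLE_EOL_FEED = {
    ("python", "2.7"): date(2020, 1, 1),
}


def parse_purl(purl):
    """Split a simple 'pkg:type/name@version' PURL into (name, version)."""
    if not purl.startswith("pkg:"):
        raise ValueError(f"not a PURL: {purl!r}")
    type_and_name, _, version = purl[len("pkg:"):].partition("@")
    _type, _, name = type_and_name.partition("/")
    return name, version


def is_eol(purl, today, feed=SAMPLE_EOL_FEED):
    """Return True only if the feed lists an EOL date on or before `today`."""
    name, version = parse_purl(purl)
    cycle = ".".join(version.split(".")[:2])  # e.g. "2.7.18" -> "2.7"
    eol = feed.get((name, cycle))
    return eol is not None and eol <= today


print(is_eol("pkg:generic/python@2.7.18", date(2022, 9, 1)))
print(is_eol("pkg:generic/python@3.10.7", date(2022, 9, 1)))
```

A release cycle that is missing from the feed is reported as not EOL here, which really means "unknown". None of this works unless the SBOM contains an entry for the language runtime in the first place, which is the gap this issue describes.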

spiffcs commented 2 years ago

@captn3m0 for Python and Busybox detection, try running the power-user command provided by syft. This will execute the classifier path and should give you detection for things like Python, Go, and Busybox.

Example: syft power-user library/python:3-slim

The results should be under classifications.

I'll take a look at the other reports from this issue as well.

captn3m0 commented 2 years ago

The power-user command is helpful, but since the data is missing in the SBOM, every other tool now needs to build in custom tooling. Anything relying on the standard SBOM formats will miss out on these critical dependencies.

Perhaps classifiers should provide a template alongside that can be included in the BOM?

For example, "python-binary","3.10.7" from the classification could translate to pkg:generic/python@3.10.7
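A minimal sketch of that translation, assuming the classification is available as a quoted "classifier","version" pair as above. The mapping table and function name are hypothetical and are not part of syft.

```python
import csv
from urllib.parse import quote

# Hypothetical template table: classifier name -> package name in the PURL.
CLASSIFIER_TO_NAME = {
    "python-binary": "python",
    "busybox-binary": "busybox",
}


def classification_to_purl(line):
    """Turn a '"classifier","version"' line into a pkg:generic PURL.

    Returns None for classifiers that have no template.
    """
    classifier, version = next(csv.reader([line]))
    name = CLASSIFIER_TO_NAME.get(classifier)
    if name is None:
        return None
    # Percent-encode both parts; plain names and versions pass through unchanged.
    return f"pkg:generic/{quote(name, safe='')}@{quote(version, safe='')}"


print(classification_to_purl('"python-binary","3.10.7"'))
```

Keeping the template next to each classifier would let the SBOM writer emit a package entry without every downstream tool needing its own mapping.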

captn3m0 commented 2 years ago

I re-ran the images against the power-user command, and the results are slightly better. There are some incorrect detections for Python binaries, but that's expected with heuristic matching. However, it doesn't look like it picked up anything beyond Python, Go, and Busybox, since that's all the classifiers detect today.

Here's the output from the important ones:

Scanning python:3-slim
"cpython-source","3.10.7"
"python-binary","3.10.7"

Scanning python:3-alpine
"busybox-binary","1.35.0"
"cpython-source","3.10.7"
"python-binary","3.10.7"

Scanning python:2
"cpython-source","2.7.18"
"python-binary","2.7.16"
"python-binary","2.7.18"
"python-binary","3.7.3"

Scanning python:3
"cpython-source","3.10.7"
"python-binary","3.10.7"
"python-binary","3.9.2"

Scanning busybox:1.35-musl
"busybox-binary","1.35.0"

Scanning busybox:stable
"busybox-binary","1.34.1"

Scanning golang:alpine
"busybox-binary","1.35.0"
"go-binary","1.1"
"go-binary-hint","1.19.1"

Scanning golang:buster
"go-binary","1.1"
"go-binary-hint","1.19.1"
"python-binary","2.7.16"

tianon commented 2 years ago

FWIW, both of those Python versions that "work" are picking up the distro build of Python, not the Python that's first in the path / intentionally provided by the image (instances of https://github.com/docker-library/python/issues/744, essentially) -- because they're the non-slim "batteries included" images, some of the "batteries" in their base layer use Python so it gets included via the distro (and that's somewhat dangerous for us to override/remove) :disappointed:

tgerla commented 1 year ago

Hi @captn3m0, I'm looking at some old issues and I have spot-checked a few of these with the latest version of Syft, and I believe we have solved most of these problems. I'll close this issue but if you run into any other missing packages, please go ahead and open a new issue and we can look into it. Thank you!

captn3m0 commented 1 year ago

Double checked all of the above, and filed #1963.