chainguard-dev / darkfiles

Darkfiles finds orphaned files in container images and makes them do bad deeds
Apache License 2.0

Negative files :) #3

Open amouat opened 2 years ago

amouat commented 2 years ago

I wasn't expecting this output:

./darkfiles stats --distro=debian debian:latest
INFO flattening image index.docker.io/library/debian
INFO flattened image to /var/folders/g9/k525vdmx1ks2hy9qktq1x4qh0000gn/T/image-dump-453542286.tar (123 MB)
Total files in image:       2896
Files in packages:          2898
Files not in packages:      -2
Tracked by package manager: 100.069061%

I did expect it to be around 100% as it is the base image.

redis:latest is even worse.

kstevena commented 1 year ago

I came across the same issue. It comes from the fact that the tool assumes every file that is part of a package will be present in the image; when that is not the case, the stats are biased. For debian:latest, the following paths are referenced in packages but not found in the image:

not found in image /.
not found in image /var/cache/apt/archives
not found in image /var/cache/apt/archives/partial
not found in image /var/lib/apt/lists/partial

The two paths under /var/cache/ are filtered out, so in the end two "extra" files remain, which accounts for the -2.
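To make the arithmetic concrete, here is a minimal sketch with toy data. It assumes the tool derives "Files not in packages" by subtracting two counts, which is what the output suggests but is not confirmed against the darkfiles source. The paths in `image_files` and all variable names are made up for illustration.

```python
# Toy data, not the real debian:latest file listing.
image_files = {"/bin/sh", "/etc/passwd", "/usr/bin/apt"}
package_files = {
    "/bin/sh", "/etc/passwd", "/usr/bin/apt",
    "/.", "/var/lib/apt/lists/partial",  # packaged, but absent from the image
}

# Subtracting counts silently assumes every packaged path exists in the image.
untracked_by_count = len(image_files) - len(package_files)

# A set difference only counts paths that are actually present in the image.
untracked_by_set = len(image_files - package_files)
missing_from_image = sorted(package_files - image_files)

print(untracked_by_count)   # -2
print(untracked_by_set)     # 0
print(missing_from_image)   # ['/.', '/var/lib/apt/lists/partial']

# The same subtraction with the numbers reported in this issue:
print(2896 - 2898)                   # -2
print(f"{2898 / 2896 * 100:.6f}%")   # 100.069061%
```

With the set difference the count can never go below zero, and the packaged-but-missing paths are available as their own list.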

amouat commented 1 year ago

Lol, it is sort of negative files then :)

It would make sense to have a separate stat for this.
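A rough sketch of what such a separate stat could look like. The `image_stats` helper and its field names are hypothetical and not part of darkfiles; the inputs are toy paths.

```python
def image_stats(image_files, package_files):
    """Hypothetical helper: count packaged-but-missing paths separately."""
    image_files = set(image_files)
    package_files = set(package_files)
    tracked = image_files & package_files  # packaged AND present in the image
    total = len(image_files)
    return {
        "total_in_image": total,
        "tracked_in_image": len(tracked),
        "not_in_packages": total - len(tracked),
        "packaged_but_missing": len(package_files - image_files),
        # Based on files present in the image, so it stays within 0..100.
        "tracked_pct": 100.0 * len(tracked) / total if total else 0.0,
    }


stats = image_stats(
    ["/bin/sh", "/etc/passwd", "/opt/app"],
    ["/bin/sh", "/etc/passwd", "/.", "/var/lib/apt/lists/partial"],
)
for name, value in stats.items():
    print(f"{name}: {value}")
```

Because the percentage is computed only from files present in the image, it cannot exceed 100%, while the "negative files" show up under `packaged_but_missing`.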

kstevena commented 1 year ago

IMHO, as this tool focuses on identifying content that is present in the image but not managed by the OS package manager, these missing paths could simply be discarded. In fact there are two cases: