Open pombredanne opened 3 years ago
Gentle ping :)
We have the list of installed packages in /var/lib/dpkg/status, you're requesting a list of installed files, mapped back to the packages?
cc @loosebazooka
@dlorenc
you're requesting a list of installed files, mapped back to the packages?
yes
Yeah I think this needs to be solved in rules_docker. Can you point to the debian docs for this, that would be helpful.
@loosebazooka See
Now since you already departed from the standard dpkg Debian layout with the status.d/ layout, feel free to use what you like.
IMHO the simplest would be something such as /var/lib/dpkg/info/package_name.list, the list of files and directories installed by the package, stored side-by-side with the status file.
E.g. given /var/lib/dpkg/status.d/tzdata, which contains the package status for tzdata, /var/lib/dpkg/status.d/tzdata.list would be the list of installed paths for tzdata, one line per path.
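To illustrate how a scanner could consume the layout proposed above, here is a minimal sketch. It assumes the hypothetical `<package>.list` naming from this comment (one path per line, stored next to the status file); the function name `index_installed_files` is made up for illustration and is not part of any existing tool.

```python
import os
from collections import defaultdict


def index_installed_files(status_dir):
    """Map each installed path to the package names whose .list file mentions it.

    Assumes the layout proposed in this thread: status.d/<package> holds the
    package status and status.d/<package>.list holds one installed path per line.
    """
    owners = defaultdict(list)
    for entry in sorted(os.listdir(status_dir)):
        if not entry.endswith(".list"):
            continue  # skip the status files themselves and anything else
        package = entry[: -len(".list")]
        with open(os.path.join(status_dir, entry), encoding="utf-8") as handle:
            for line in handle:
                path = line.rstrip("\n")
                if path:
                    owners[path].append(package)
    return dict(owners)
```

Calling `index_installed_files("/var/lib/dpkg/status.d")` inside an unpacked image would then answer "which package installed this file?" with a plain dictionary lookup.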
It would be nice to also document this of course (including the actual use of status.d/ and the corresponding copyright files that are already there)
For reference I also entered https://github.com/bazelbuild/rules_docker/issues/1876 way back when (some would say this is a double post... but I was not sure where to post what ;) )
Yeah I'm not exactly sure about the history of this change. So I'll have to do some reading, but thanks for the link.
@loosebazooka
I'm not exactly sure about the history of this change.
I am not sure what you mean by this... but if you mean about when the status.d files were introduced and what was there before, this looks simple from what I can see.
There was a single commit that introduced keeping some metadata, https://github.com/bazelbuild/rules_docker/commit/f5432b813e0a11491cf2bf83ff1a923706b36420, which essentially takes the control file and dumps it under status.d/
Before that, no metadata was kept: https://github.com/bazelbuild/rules_docker/blob/3caf72f166f8b6b0e529442477a74871ad4d35e9/container/build_tar.py#L181
I can provide a patch in rules_docker that would have either one of these effects in https://github.com/bazelbuild/rules_docker/blob/e5368f9c425854ddb5af31624f0a6b99a0d3f1fb/container/build_tar.py#L224
Do you want such a patch?
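For context, a rough sketch of what producing such a file listing could look like. This is not the actual build_tar.py code: it assumes the package's data archive (the data.tar.* member of a .deb) is already available as a file object, and the helper names `installed_paths` and `render_list_file` are hypothetical.

```python
import tarfile


def installed_paths(data_tar_fileobj):
    """Return the absolute paths contained in a package's data tar archive."""
    paths = []
    with tarfile.open(fileobj=data_tar_fileobj, mode="r:*") as archive:
        for member in archive.getmembers():
            name = member.name
            if name.startswith("./"):
                name = name[2:]  # data archives usually store paths as ./usr/...
            name = name.strip("/")
            if name and name != ".":
                paths.append("/" + name)
    return paths


def render_list_file(paths):
    """Render paths in the one-path-per-line format discussed in this thread."""
    return "".join(path + "\n" for path in paths)
```

The rendered text could then be written next to the existing status file for the package, under whatever name is eventually agreed on.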
@loosebazooka gentle ping... do you want a patch here or at https://github.com/bazelbuild/rules_docker/issues/1876?
Oh sorry, yeah I mean I don't know why this form of metadata was chosen. Anyway, it seems like the correct place to inject the metadata is in rules_docker. Please provide a patch there.
@pombredanne gentle ping, any news on the patch?
@pombredanne gentle ping, any news on the patch?
I have not attacked this yet. Do you want to chip in and help?
Let's continue the discussion over at bazelbuild/rules_docker#1876
@loosebazooka FYI I pushed a fix in https://github.com/bazelbuild/rules_docker/pull/2065 and your review is mucho welcomed there
@thesayyn are these covered in the new rules_oci?
@thesayyn are these covered in the new rules_oci?
Yes. it is.
NOTE: some packages don't have an md5sums file, in that case, it is absent.
@thesayyn are these covered in the new rules_oci?
Yes. it is.
Does this mean we should already see it reflected in new images?
NOTE: some packages don't have an md5sums file, in that case, it is absent.
at least for the packages that have md5sums files
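For anyone checking an image by hand, here is a small sketch of reading such a file while tolerating its absence, as the note describes. It assumes the standard dpkg md5sums line format (hex digest, two spaces, path relative to the root); where exactly the file lives in the image is not stated in this thread, so the caller passes the path, and the name `read_md5sums` is hypothetical.

```python
import os


def read_md5sums(path):
    """Return {absolute path: md5 hex digest}, or None if the file is absent.

    Some packages ship no md5sums file at all, so a missing file is treated
    as "no data" rather than as an error.
    """
    if not os.path.exists(path):
        return None
    checksums = {}
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.rstrip("\n")
            if not line:
                continue
            # dpkg format: "<md5 hex>  <path relative to />"
            digest, _, relative = line.partition("  ")
            checksums["/" + relative.lstrip("/")] = digest
    return checksums
```

A `None` result then simply means the package provided no checksums, while a dictionary gives both the installed paths and their digests.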
@fedemengo not yet. We're in the middle of a larger transition to rules_oci and when that is complete, you will begin to see this metadata.
awesome, thanks for the update
@loosebazooka you wrote:
We're in the middle of a larger transition to rules_oci and when that is complete, you will begin to see this metadata.
Hey! is the transition done?
looks like after https://github.com/GoogleContainerTools/distroless/pull/1367 the new images contain the expected metadata
Since distroless images are primarily built with Bazel I filed this issue https://github.com/bazelbuild/rules_docker/issues/1876 that I am repasting here... but I reckon this may need to be tracked here instead:
🚀 feature request
Relevant Rules
When a package is installed, only metadata are kept and the list of installed files is lost/not saved with the package metadata.
I have a concern with what happens here: https://github.com/bazelbuild/rules_docker/blob/d18033b7eb3429a55dc4a579b5c19af57ab25e5f/container/build_tar.py#L224
Description
In a distroless container image, the as-installed .deb packages are not saved with their files/md5sums file lists in what would be in /var/lib/dpkg/info on a regular Debian install. As a result, it is not possible to relate an installed package in a distroless image/layer to the set of files that were installed with this package. This data can be important for software composition analysis and its security and license compliance tracking applications.
Describe the solution you'd like
Each installed package should include some installed file listing, possibly added in some per-package file in the status.d/ directory. This is a Debian standard in /var/lib/dpkg/info/<package name>
This would make distroless images more readily introspectable and observable, otherwise there is no intrinsic way to relate a package (in status.d) to the set of its installed files.
@tejal29 you committed this originally with @dlorenc ... any insight to share there?
Describe alternatives you've considered
I cannot fathom an in-container alternative to keep tabs on each package-installed file. Tracking outside would mean maintaining some external database, which does not seem practical.