Closed sophiewigmore closed 1 year ago
After a preliminary investigation, it seemed like running something like apt policy <pkg>
for each installed package would potentially give us metadata about where the dependency came from. However, as the stack exists now, this metadata did not exist so we may need to run some other commands first to be able to get this information.
One possible solution we came up with is utilizing /etc/apt/preferences with content like:
Package: build-essential ca-certificates curl git jq libgmp-dev libssl3 libyaml-0-2 netbase openssl tzdata xz-utils zlib1g-dev
Pin: release c=multiverse
Pin-Priority: -1
Package: build-essential ca-certificates curl git jq libgmp-dev libssl3 libyaml-0-2 netbase openssl tzdata xz-utils zlib1g-dev
Pin: release c=restricted
Pin-Priority: -1
The Pin-Priority: -1
setting will prevent that the listed packages can be installed from multiverse
/restricted
. This file could be generate by jam
, based on the list of packages in stack.toml
and then passed to the docker build
via an argument, like it's done for /etc/apt/sources.list.
WDYT?
@modulo11 I looked into solutions for ages and was stumped because we remove /var/lib/apt/lists
during stack creation, which would prevent us from looking up package sources after installation. This is a great idea! Tagging other @paketo-buildpacks/stacks-maintainers to see if they have thoughts, but to me this seems like the best way to enforce this
Cool, we can draft some PR to show the implementation.
We created a draft PR for this: https://github.com/paketo-buildpacks/jam/pull/113
An alternative implementation of this is the following:
--- a/stack/build.Dockerfile
+++ b/stack/build.Dockerfile
@@ -5,6 +5,7 @@ ARG packages
ARG package_args='--no-install-recommends'
RUN echo "$sources" > /etc/apt/sources.list
+RUN echo -e "Package: $packages\nPin: release c=multiverse\nPin-Priority: -1\n\nPackage: $packages\nPin: release c=restricted\nPin-Priority: -1\n" > /etc/apt/preferences
RUN echo "debconf debconf/frontend select noninteractive" | debconf-set-selections && \
export DEBIAN_FRONTEND=noninteractive && \
apt-get -y $package_args update && \
It does not require any changes to jam (and therefore jam stays distribution agnostic), but it would duplicate the logic into all stacks.
What would you prefer?
I'm not super familiar with this apt
configuration, so I appreciate you highlighting it. Am I correct in understanding that this proposal would mean all the packages
in run.args.sources
(and build.args.sources
) would have to come from a source other than multiverse
? I.e. we disallow multiverse
for all packages that are listed in the stack.toml
?
I think this should work for the use-case we have (i.e. using the build image to pre-compile Php native extensions on the same build image but outside of the buildpack lifecycle) but I'd prefer to keep the stacks a little more extensible if possible. I would like to enable modifications of the stack.toml
without also having to modify the Dockerfile
, and in this concrete case, I'd like to allow someone to use the Dockerfile
as-is, add a multiverse
package to the stack.toml
and rebuild the stack. I believe this proposal would prevent that.
As a counter point, it is already effectively impossible to re-use the stack images and add your own package without also making your own stack.toml
- at which point you could also change the Dockerfile
- so perhaps this change is reasonable.
My ideal solution to this verification problem would be a solution where the package source validation could be performed against any existing, pre-built stack. I was hoping we could identify a solution that took the stack.toml
file and the built stack run/build images, and was able to assert that for each of the build/run images, all the packages listed in the stack.toml came from the (non-multiverse
) sources in the stack.toml. Or, at least, that none of the packages could have come from the multiverse
source.
Am I correct in understanding that this proposal would mean all the
packages
inrun.args.sources
(andbuild.args.sources
) would have to come from a source other thanmultiverse
? I.e. we disallowmultiverse
for all packages that are listed in thestack.toml
?
Yes, and also disallow restricted
I'd like to allow someone to use the
Dockerfile
as-is, add amultiverse
package to thestack.toml
and rebuild the stack. I believe this proposal would prevent that.
True, both variants of our proposal would prevent this.
a solution where the package source validation could be performed against any existing, pre-built stack.
I guess "any" needs some limitations at least something like "any Ubuntu based stack" or "any stack base on an apt based distribution".
Even with that it's not a trivial task. The metadata required for this validation is not part of the images (AFAIK). So one would need to fetch the data from the apt repo again and check if non of the installed packages are available in the multiverse
repo.
I guess "any" needs some limitations at least something like "any Ubuntu based stack" or "any stack base on an apt based distribution".
Yes, sorry. That's a good call out. I would explicitly constrain any solution to only supporting the ubuntu-based stacks we currently build.
Even with that it's not a trivial task. The metadata required for this validation is not part of the images (AFAIK). So one would need to fetch the data from the apt repo again and check if non of the installed packages are available in the multiverse repo.
Yes, agreed. I think that's fine though. To get into potential solutions for a second, I was wondering if we could do something like a job in a github workflow alongside the integration tests (or perhaps the package removal assertions) that uses the image as the FROM
for another docker image that adds back in the metadata (via apt update
or similar) and then runs an assertion on the packages via a command on the newly-created image.
I don't think we have to ship the exact image we assert against, as long as we can satisfy ourselves that creating a second image has no side effects that invalidate the assertion itself (which I think is true for something like apt update && dpkg -s | apt policy
or similar).
uses the image as the
FROM
for another docker image that adds back in the metadata (viaapt update
or similar) and then runs an assertion on the packages via a command on the newly-created image.
In general this could work. But it will cause problems with the tiny run image, I guess.
Yes, the tiny stacks would require some extra thought. I still think it's possible though.
I'd like to quickly sum up where we currently stand, to see if/how we can continue here.
The proposal to use /etc/apt/preferences
essentially prevents apt
to install packages from certain component, e.g. multiverse
. This could either be enforced in jam
(https://github.com/paketo-buildpacks/jam/pull/113/commits/d66fc0e2cfd0f0bf8ccc6e9a44c9251b8a5c5bc0), potentially hidden behind some flag, or directly in the Dockerfile
(https://github.com/paketo-buildpacks/bionic-tiny-stack/issues/40#issuecomment-1230380234). /etc/apt/preferences
could also be removed after all packages are installed to leave it open for extensions.
This approach would work on any stack, including tiny, as it enforces components right at the root instead of asserting it on the final image. Do you think we can continue with the approach @robdimsdale or drop the idea?
Hey @modulo11 and @phil9909, I missed this whole thread while I was OOO. I think that removing /etc/apt/preferences
after package install is clever, but would still cause issues with Rob's request to enable users to modify the stack.toml with a multiverse
package while leaving the Dockerfile as-is. Unless I've misunderstood what you meant?
I'm intrigued by the hidden feature flag in jam
potentially, since we could just use that flag in our own builds. I'm not going to speak for Rob though, and he has a more in-depth understanding of the concerns here. As a heads up, he is on vacation this week himself, so it'll probably be a week before he can weigh in.
Hi all,
I appreciate the investigation work and the exploration into multiple ways to achieve the outcome of having confidence that multiverse
packages weren't installed on the Paketo stacks.
I think the path that makes the most sense to me is to restrict the sources via /etc/apt/preferences
in the Dockerfile
and remove this change when we're done (i.e. in the Dockerfile
after installing packages). I agree that this means changing the Dockerfile
s in each stack, but I'm ok with that duplication.
Adding this to the suggestion from above results in a diff that would look something like the following (not actually a valid diff):
--- a/stack/build.Dockerfile
+++ b/stack/build.Dockerfile
@@ -5,6 +5,7 @@ ARG packages
ARG package_args='--no-install-recommends'
RUN echo "$sources" > /etc/apt/sources.list
+RUN echo -e "Package: $packages\nPin: release c=multiverse\nPin-Priority: -1\n\nPackage: $packages\nPin: release c=restricted\nPin-Priority: -1\n" > /etc/apt/preferences
RUN echo "debconf debconf/frontend select noninteractive" | debconf-set-selections && \
export DEBIAN_FRONTEND=noninteractive && \
apt-get -y $package_args update && \
[...]
+RUN rm /etc/apt/preferences
I think this solution is simpler than my alternative proposal of running an assertion against the built stack images.
For completeness, here are my thoughts about the other proposed solutions:
jam
:
I don't think it makes sense for jam
to have ubuntu-specific knowledge. Even if we abstracted it to allow for different types of operating systems, I don't think that behavior should live in jam
. I think jam create-stack
should delegate the OS-specific details to stack.toml
and Dockerfile
files.Dockerfile
for the Paketo stacks (without removing it):
I think there is value in being able to pre-compile dependencies on the same image as the stack itself, and I think it is reasonable that some buildpacks are allowed to pull in multiverse
packages. This is exactly the use-case we encountered in Php. I want to be able to have confidence that the stack images we ship for build and run do not have multiverse
packages, while simultaneously allowing buildpacks to pre-compile whatever the need on the same stack image.@modulo11 @phil9909 how does this sound to you?
Hi @robdimsdale,
sounds fine 👍 . Should this be implemented for both, build
and run
images?
If that's clear we'd continue and open a bunch of PRs for jammy
and bionic
.
Yes, I think we'd like to implement this for both build
and run
. Run is probably more important than build
but they're both valuable and I see no reason to omit build
.
Just a quick note, when I tested this, i had to drop the -e
from the echo -e ... > /etc/apt/preferences
command.
I tested this proposal by adding the aac-enc
package (which comes from restricted
) and observing that before adding the package pinning change above, this package was installed successfully, and after this packaging pinning change the build failed to install this package. This demonstrates the inability to install a multiverse
package on the stack during stack creation.
Additionally, with this package pinning in place, I was able to install the aac-enc
package on the built image. This demonstrates the ability to install packages from other sources once the image is built.
With the addition of #70 this is complete. Closing.
In #39 we re-added the
multiverse
package source to the stack images with the intention of enabling use cases that might need packages from this location.We do not want to ship any packages from
multiverse
in the stacks themselves, as explained in the linked PR. To prevent against this in the future, we should add a test assertion to the suite that checks that no packages come from themultiverse
repository.