paketo-buildpacks / bionic-tiny-stack

Tiny (minimal) stack for Ubuntu 18.04: Bionic Beaver
Apache License 2.0
4 stars 1 forks source link

Add test assertion to make sure we don't install multiverse packages #40

Closed sophiewigmore closed 1 year ago

sophiewigmore commented 2 years ago

In #39 we re-added the multiverse package source to the stack images with the intention of enabling use cases that might need packages from this location.

We do not want to ship any packages from multiverse in the stacks themselves, as explained in the linked PR. To prevent against this in the future, we should add a test assertion to the suite that checks that no packages come from the multiverse repository.

sophiewigmore commented 2 years ago

After a preliminary investigation, it seemed like running something like apt policy <pkg> for each installed package would potentially give us metadata about where the dependency came from. However, as the stack exists now, this metadata did not exist so we may need to run some other commands first to be able to get this information.

modulo11 commented 2 years ago

One possible solution we came up with is utilizing /etc/apt/preferences with content like:

Package: build-essential ca-certificates curl git jq libgmp-dev libssl3 libyaml-0-2 netbase openssl tzdata xz-utils zlib1g-dev
Pin: release c=multiverse
Pin-Priority: -1

Package: build-essential ca-certificates curl git jq libgmp-dev libssl3 libyaml-0-2 netbase openssl tzdata xz-utils zlib1g-dev
Pin: release c=restricted
Pin-Priority: -1

The Pin-Priority: -1 setting will prevent that the listed packages can be installed from multiverse/restricted. This file could be generate by jam, based on the list of packages in stack.toml and then passed to the docker build via an argument, like it's done for /etc/apt/sources.list.

WDYT?

sophiewigmore commented 2 years ago

@modulo11 I looked into solutions for ages and was stumped because we remove /var/lib/apt/lists during stack creation, which would prevent us from looking up package sources after installation. This is a great idea! Tagging other @paketo-buildpacks/stacks-maintainers to see if they have thoughts, but to me this seems like the best way to enforce this

modulo11 commented 2 years ago

Cool, we can draft some PR to show the implementation.

phil9909 commented 2 years ago

We created a draft PR for this: https://github.com/paketo-buildpacks/jam/pull/113

An alternative implementation of this is the following:

--- a/stack/build.Dockerfile
+++ b/stack/build.Dockerfile
@@ -5,6 +5,7 @@ ARG packages
 ARG package_args='--no-install-recommends'

 RUN echo "$sources" > /etc/apt/sources.list
+RUN echo -e "Package: $packages\nPin: release c=multiverse\nPin-Priority: -1\n\nPackage: $packages\nPin: release c=restricted\nPin-Priority: -1\n" > /etc/apt/preferences
 RUN echo "debconf debconf/frontend select noninteractive" | debconf-set-selections && \
   export DEBIAN_FRONTEND=noninteractive && \
   apt-get -y $package_args update && \

It does not require any changes to jam (and therefore jam stays distribution agnostic), but it would duplicate the logic into all stacks.

What would you prefer?

robdimsdale commented 2 years ago

I'm not super familiar with this apt configuration, so I appreciate you highlighting it. Am I correct in understanding that this proposal would mean all the packages in run.args.sources (and build.args.sources) would have to come from a source other than multiverse? I.e. we disallow multiverse for all packages that are listed in the stack.toml?

I think this should work for the use-case we have (i.e. using the build image to pre-compile Php native extensions on the same build image but outside of the buildpack lifecycle) but I'd prefer to keep the stacks a little more extensible if possible. I would like to enable modifications of the stack.toml without also having to modify the Dockerfile, and in this concrete case, I'd like to allow someone to use the Dockerfile as-is, add a multiverse package to the stack.toml and rebuild the stack. I believe this proposal would prevent that.

As a counter point, it is already effectively impossible to re-use the stack images and add your own package without also making your own stack.toml - at which point you could also change the Dockerfile - so perhaps this change is reasonable.

My ideal solution to this verification problem would be a solution where the package source validation could be performed against any existing, pre-built stack. I was hoping we could identify a solution that took the stack.toml file and the built stack run/build images, and was able to assert that for each of the build/run images, all the packages listed in the stack.toml came from the (non-multiverse) sources in the stack.toml. Or, at least, that none of the packages could have come from the multiverse source.

phil9909 commented 2 years ago

Am I correct in understanding that this proposal would mean all the packages in run.args.sources (and build.args.sources) would have to come from a source other than multiverse? I.e. we disallow multiverse for all packages that are listed in the stack.toml?

Yes, and also disallow restricted

I'd like to allow someone to use the Dockerfile as-is, add a multiverse package to the stack.toml and rebuild the stack. I believe this proposal would prevent that.

True, both variants of our proposal would prevent this.

a solution where the package source validation could be performed against any existing, pre-built stack.

I guess "any" needs some limitations at least something like "any Ubuntu based stack" or "any stack base on an apt based distribution". Even with that it's not a trivial task. The metadata required for this validation is not part of the images (AFAIK). So one would need to fetch the data from the apt repo again and check if non of the installed packages are available in the multiverse repo.

robdimsdale commented 2 years ago

I guess "any" needs some limitations at least something like "any Ubuntu based stack" or "any stack base on an apt based distribution".

Yes, sorry. That's a good call out. I would explicitly constrain any solution to only supporting the ubuntu-based stacks we currently build.

Even with that it's not a trivial task. The metadata required for this validation is not part of the images (AFAIK). So one would need to fetch the data from the apt repo again and check if non of the installed packages are available in the multiverse repo.

Yes, agreed. I think that's fine though. To get into potential solutions for a second, I was wondering if we could do something like a job in a github workflow alongside the integration tests (or perhaps the package removal assertions) that uses the image as the FROM for another docker image that adds back in the metadata (via apt update or similar) and then runs an assertion on the packages via a command on the newly-created image.

I don't think we have to ship the exact image we assert against, as long as we can satisfy ourselves that creating a second image has no side effects that invalidate the assertion itself (which I think is true for something like apt update && dpkg -s | apt policy or similar).

phil9909 commented 2 years ago

uses the image as the FROM for another docker image that adds back in the metadata (via apt update or similar) and then runs an assertion on the packages via a command on the newly-created image.

In general this could work. But it will cause problems with the tiny run image, I guess.

robdimsdale commented 2 years ago

Yes, the tiny stacks would require some extra thought. I still think it's possible though.

modulo11 commented 2 years ago

I'd like to quickly sum up where we currently stand, to see if/how we can continue here.

The proposal to use /etc/apt/preferences essentially prevents apt to install packages from certain component, e.g. multiverse. This could either be enforced in jam (https://github.com/paketo-buildpacks/jam/pull/113/commits/d66fc0e2cfd0f0bf8ccc6e9a44c9251b8a5c5bc0), potentially hidden behind some flag, or directly in the Dockerfile (https://github.com/paketo-buildpacks/bionic-tiny-stack/issues/40#issuecomment-1230380234). /etc/apt/preferences could also be removed after all packages are installed to leave it open for extensions.

This approach would work on any stack, including tiny, as it enforces components right at the root instead of asserting it on the final image. Do you think we can continue with the approach @robdimsdale or drop the idea?

sophiewigmore commented 2 years ago

Hey @modulo11 and @phil9909, I missed this whole thread while I was OOO. I think that removing /etc/apt/preferences after package install is clever, but would still cause issues with Rob's request to enable users to modify the stack.toml with a multiverse package while leaving the Dockerfile as-is. Unless I've misunderstood what you meant?

I'm intrigued by the hidden feature flag in jam potentially, since we could just use that flag in our own builds. I'm not going to speak for Rob though, and he has a more in-depth understanding of the concerns here. As a heads up, he is on vacation this week himself, so it'll probably be a week before he can weigh in.

robdimsdale commented 2 years ago

Hi all,

I appreciate the investigation work and the exploration into multiple ways to achieve the outcome of having confidence that multiverse packages weren't installed on the Paketo stacks.

I think the path that makes the most sense to me is to restrict the sources via /etc/apt/preferences in the Dockerfile and remove this change when we're done (i.e. in the Dockerfile after installing packages). I agree that this means changing the Dockerfiles in each stack, but I'm ok with that duplication.

Adding this to the suggestion from above results in a diff that would look something like the following (not actually a valid diff):

--- a/stack/build.Dockerfile
+++ b/stack/build.Dockerfile
@@ -5,6 +5,7 @@ ARG packages
 ARG package_args='--no-install-recommends'

 RUN echo "$sources" > /etc/apt/sources.list
+RUN echo -e "Package: $packages\nPin: release c=multiverse\nPin-Priority: -1\n\nPackage: $packages\nPin: release c=restricted\nPin-Priority: -1\n" > /etc/apt/preferences
 RUN echo "debconf debconf/frontend select noninteractive" | debconf-set-selections && \
   export DEBIAN_FRONTEND=noninteractive && \
   apt-get -y $package_args update && \
   [...]
+RUN rm /etc/apt/preferences

I think this solution is simpler than my alternative proposal of running an assertion against the built stack images.

For completeness, here are my thoughts about the other proposed solutions:

@modulo11 @phil9909 how does this sound to you?

modulo11 commented 2 years ago

Hi @robdimsdale, sounds fine 👍 . Should this be implemented for both, build and run images? If that's clear we'd continue and open a bunch of PRs for jammy and bionic.

robdimsdale commented 2 years ago

Yes, I think we'd like to implement this for both build and run. Run is probably more important than build but they're both valuable and I see no reason to omit build.

Just a quick note, when I tested this, i had to drop the -e from the echo -e ... > /etc/apt/preferences command.

robdimsdale commented 2 years ago

I tested this proposal by adding the aac-enc package (which comes from restricted) and observing that before adding the package pinning change above, this package was installed successfully, and after this packaging pinning change the build failed to install this package. This demonstrates the inability to install a multiverse package on the stack during stack creation.

Additionally, with this package pinning in place, I was able to install the aac-enc package on the built image. This demonstrates the ability to install packages from other sources once the image is built.

robdimsdale commented 1 year ago

With the addition of #70 this is complete. Closing.