Elektrobit / flake-pilot

Registration/Control utility for applications launched through a runtime-engine, e.g. containers
MIT License

Add 3 way dependency resolution #37

Closed rjschwei closed 1 year ago

rjschwei commented 1 year ago

Currently oci-ctl can resolve dependencies between 2 containers/layers. For example, if I have a Python application that depends on a specific interpreter, say 3.10, I can build a delta container that has only my bits and use the interpreter from python-3-10-basesystem; oci-ctl handles the layers and lets me register "my-cmd", making it appear to the user as if "my-cmd" were installed on the host system, i.e. the containers are transparent.
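A rough sketch of that two-container case, using the oci-ctl commands shown later in this thread (registry URI and container/app names are illustrative):

    oci-ctl pull --uri registry.example.org/containers/python-3-10-basesystem:latest
    oci-ctl pull --uri registry.example.org/containers/my-app:latest

    # register "my-cmd"; the delta container "my-app" is layered on top of
    # the interpreter base at call time and looks like a host binary
    oci-ctl register --container my-app --app /usr/bin/my-cmd --base python-3-10-basesystem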

There is another use case where my application also depends on something I know to be part of the host system; the package manager is a good example. As such it would be great if oci-ctl could also layer the proper parts of the host OS into the file system view for "my-cmd". In the flakes file this could be handled with an additional directive such as

hostsystem:
    - zypper

oci-ctl could inspect the package database to determine which bits and pieces need to be visible to "my-cmd" to make this work.
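To illustrate, the package database already carries that information; a sketch using rpm directly (whether oci-ctl shells out to rpm or uses a library binding is an open implementation detail):

    # files owned by the zypper package on the host
    rpm -ql zypper

    # capabilities zypper requires, pointing at further packages
    # that may also need to be visible
    rpm -qR zypper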

The approach can be thought of in the way that open linking works: I can link a C application in a way that the linker doesn't complain at link time when a specific library is not found; the resolution is deferred to run time.
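Roughly the same idea with GNU ld (illustrative only):

    # link succeeds even though some symbols cannot be resolved at link time;
    # the dynamic linker resolves them when the program actually runs
    gcc main.c -o my-app -Wl,--unresolved-symbols=ignore-all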

schaefi commented 1 year ago

Thanks. As we were discussing this I had a nice idea of how this could be solved. I will come up with more details, coupled to a pull request and an example.

schaefi commented 1 year ago

@rjschwei Ok here is how I think this could be done. If we set the dependency as part of a delta container description, that is imho the right place to express that we need this package for the container app to run. However, we don't want it to be part of the delta container; it should be provisioned later at startup time by oci-pilot. We indicate this by also uninstalling the package that was specified as part of the delta container description. An example of what I mean by that can be found here:
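In other words, a minimal sketch of such a description fragment (package name borrowed from the SUSEConnect case below; the full examples further down in this thread show the complete descriptions):

    <!-- the app needs SUSEConnect at run time, but it should come from the host -->
    <packages type="image">
        <package name="SUSEConnect"/>
    </packages>
    <!-- uninstalling it again keeps it out of the delta; kiwi records the
         removed files so oci-pilot can provision them from the host -->
    <packages type="uninstall">
        <package name="SUSEConnect"/>
    </packages>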

In kiwi (in the delta branch) the patch was extended to detect these differences between the installation and the uninstallation/deletion of packages/data. The changeset is written to a file /vanished inside the delta container. This has the advantage that other tools can consume the information without doing any dependency resolution of their own, it works distribution-independent, and it also captures chains of dependencies. For example, the uninstall operation of SUSEConnect in the example above also uninstalls dmidecode, a dependency of SUSEConnect that gets dropped at uninstall time. You get a better understanding of it if you look into the /vanished file that now exists in each of the delta containers (just untar the container tarball).

The /vanished file can also contain a few "false positives", i.e. data that changes because an uninstall procedure also writes log data or modifies the rpm database files. The obvious ones kiwi will skip itself; others could be valid changes and will stay. It will be up to the implementation of the provisioning step in oci-pilot to take further influence where needed.

The implementation I plan here in oci-pilot will use the /vanished metadata inside the delta containers to provision data from the host into the container, as you requested, as the third source of dependencies. I'm confident that this concept is stable enough to meet the needs. However, as you can imagine, as soon as oci-pilot mixes data from the host with the container instance it becomes possible to produce something broken. That's imho a part we have to accept for achieving host-connected app containers.

Thoughts ?

rjschwei commented 1 year ago

@schaefi vanished has a bit of a mysterious ring to it, as in "disappeared without a real explanation". As such we might want to choose a word that indicates intent, i.e. that this happened because of user action. It could be any of removed, withdrawn, eliminated, abolished, separated.

As I was looking at the example, and after a rudimentary look at the delta branch implementation in kiwi, I was wondering if we should even support

<packages type="image">

or if we should enforce

<packages type="bootstrap">

If we have the image type this implies that we get the update stack pulled in, if I remember that correctly, because kiwi will do a chroot into the image that is being built. However, if we use bootstrap then kiwi will use the update stack/package manager from the build host and direct it to install packages into the target directory where we are building the image. For delta containers I think it is reasonable to assume that the update stack will always be supplied by something else. This would also imply that the currently named /vanished file would always contain the package manager files.
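Roughly, the difference between the two modes boils down to the following (a sketch of the effect only, not the actual kiwi implementation):

    # type="image": installation runs chrooted inside the new root,
    # so the image needs its own package manager / update stack
    chroot /image-root zypper install some-package

    # type="bootstrap": the build host's zypper installs into the target
    # directory, so the image itself never needs the update stack
    zypper --root /image-root install some-package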

schaefi commented 1 year ago

@schaefi vanished has a bit of a mysterious ring to it

LOL, ok makes sense, let's change that

schaefi commented 1 year ago

This would also imply that the currently named /vanished file would always contain the package manager files.

It actually does not, because remember it builds against a base image, and all data that is already in the base image does not vanish ;) Since the base image in my case is a standard system, it already contains the zypper stack.

I also thought a delta image should be built from bootstrap only, such that there is no need to include the rpm stack. But as there is always a base image, which in my thinking provides an in-itself clean mini OS, the benefit of bootstrap only vs. bootstrap + image does not really exist. If the base image itself is an "incomplete" OS you have a point. But I did not think that we want such a base?

schaefi commented 1 year ago

So if you use <packages type="bootstrap"> in a delta description which uses a base image that already contains the rpm stack this will cause a runtime error:

[ ERROR   ]: 21:51:37 | KiwiCommandError: chroot: stderr: error: failed to replace old database with new database!
error: replace files in /usr/lib/sysimage/rpm with files from /usr/lib/sysimage/rpmrebuilddb.11970 to recover
, stdout: (no output on stdout)

which actually makes sense because we try to bootstrap an overlay tree that already has everything. So if we force delta containers to bootstrap only, we also create a tight requirement on any base image to NOT contain the rpm database.

The current approach as I explained assumes the opposite. From a description point of view it also looks more natural to me that you install additional "image" packages to a base system and create a delta from it later.

Thoughts ?

rjschwei commented 1 year ago

I would like a base image that is not a "system" image. If I use [1] as an example then I see this stacking up as follows

  1. Build a "baseimage" which is a container that has

    <packages type="bootstrap">
    <package name="python3"/>
    <package name="python3-M2Crypto"/>
    <package name="python3-lxml"/>
    <package name="python3-requests"/>
    <package name="python3-urllib3"/>

    This forms the "baseimage" that I can re-use for every other code base that only needs those Python modules. Or I could have a "baseimage" that contains all Python modules we need to have a working distro.

  2. my delta container for [1] would then have

    ....
    <type image="docker" derived_from="obs:...basepython#3.8" delta_root="true">
    <containerconfig name="cloud-regionsrv-client"/>
    </type>
    ....
    <packages type="bootstrap">
    <package name="cloud-regionsrv-cloud"/>
    ....and a few others...
    </packages>
    <packages type="uninstall">
    <package name="SUSEConnect"/>
    <package name="ca-certificates"/>
    </packages>

The plugin this delivers for zypper would have to be written in some compiled language and be installed as part of the "Host OS"

In this example I get my Python stuff from some common "baseimage" that everyone who wants to build against Python 3.8 can use, and then I get SUSEConnect, which in turn needs zypper, from the Host OS.

I guess one question that arises in this is how far the "delta concept" gets pushed / where it applies vs. the built-in layering of the container system. Because the cloud-regionsrv-client container could be built deriving from some Python container without delta=True; however, somewhere in the overlay mechanism of the layers there is a chance that the update stack exists in the container. Consider the person building the Python container using a Dockerfile and inheriting from a SUSE BCI. In that case we enter the full ugliness of duplication of the container system. There's a zypper on the "Host OS", then there is a zypper in who knows how many layers, depending on the versioning of containers that either need zypper or just pull it in because they do not know any better.

This is where I see delta containers being able to make a giant difference by not having that totally brain-dead duplication. And then, in combination with oci-pilot, one can build a layered system that has a unified and enforced view of the update stack, while the system rpmdb knows nothing about what's in any given container and doesn't care.

We could also totally eliminate the "oh, I'll run a shell in a container and add a package" abomination, i.e. get a truly immutable image because the stuff is simply not there. While /usr/bin/sh may be visible in the delta-container instance, it ultimately comes from the Host OS, as does the update stack, and as such getting a shell in the container and then running zypper up would fail if the "Host OS" is set up for transactional-update.

[1] https://github.com/SUSE-Enceladus/cloud-regionsrv-client

rjschwei commented 1 year ago

I suppose both use cases are equally valid and need to be served, i.e. I could have a base image that looks like an OS without a kernel, i.e. it has a shell, an update stack etc., or I could have a system of delta containers that I layer, where I get the update stack only from the bottom, which is the "Host OS". In the latter case I'd build all my containers with

<packages type="bootstrap">
...
</packages>
<packages type="uninstall">
....
</packages>

and in the former case, where I have a "basesystemimage" (note I liberally expanded the name ;)), the delta container would get built with

<packages type="image">
....
</packages>
schaefi commented 1 year ago

I would like a base image that is not a "system" image

yeah got it

I suppose both use cases are equally valid

yes and I want both of them to be possible

ok, with regards to your use case, you basically want to use a delta as your base. I created the container you want to use as a base here:

Next I changed the aws-cli app container to use this basepython container as its base. As you might expect, we have to add stuff in bootstrap now because the base is not a system. That works, but of course the zypper dependency resolution pulls in other packages in order to get to a clean system. The data we want to take from the host now needs to be classified either as clean uninstalls or as hard deletions. See the aws-cli container now as a result of this:

The result should be relatively near to what you are after. For the app to work we have to add the code here in oci-pilot that does the provisioning of the vanished data (name still to be changed). So aws-cli is a real three-way dependent app container. It needs basepython + aws-cli + vanished-files-from-host

The way we do this is, I think, still very intrusive, but a maintainable concept. Famous last words :)

What do you think ?

schaefi commented 1 year ago

@rjschwei if you agree that we are sort of on the right track then I would do:

rjschwei commented 1 year ago

@schaefi I think we are on the right track. This concept provides the needed flexibility and makes a high-level containerized application like the aws-cli look like a tool installed on the system, while at the same time reducing duplication in depth and breadth of the consumed containers.

schaefi commented 1 year ago

@rjschwei Sounds good. I pushed a working version of the 3 way dependency provisioning now to my test project on obs:

It would be great if you can also play a bit and test. Testing can be done as follows:

  1. fetch the test-os VM

    osc co home:marcus.schaefer:delta_containers
    cd test-os
    ./run
  2. ssh to the test-os

    ssh -p 10022 root@localhost
  3. have fun with oci-ctl

    oci-ctl pull --uri registry.opensuse.org/home/marcus.schaefer/delta_containers/containers/basepython:latest
    oci-ctl pull --uri registry.opensuse.org/home/marcus.schaefer/delta_containers/containers/aws-cli:latest
    
    oci-ctl register --container aws-cli --app /usr/bin/aws --base basepython
    
    aws ec2 help

You can go and register other example apps from the project.

In the end it would be imho good to have a working example implementing your use case as a reference

schaefi commented 1 year ago

If you want to look into the contents of an app container or double-check the (still so named) vanished file, you can do this as follows:

podman image mount aws-cli

follow the printed mount path and check contents, the vanished file etc...
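For example (run as root, and unmount again when done):

    # mount the image and look at the recorded host-provisioned files
    mount_point=$(podman image mount aws-cli)
    cat "$mount_point"/vanished

    # clean up
    podman image unmount aws-cli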

I was wondering why the aws-cli container is still relatively big, but it turned out that botocore and other Python stuff in that area adds up to quite some data, so I think it's ok.

schaefi commented 1 year ago

Provisioning 3-way is an expensive operation, depending on how much data needs to be shuffled around. I still think the decision to containerize an app needs to be taken wisely. Applications that you call a thousand times with different caller options and that have a big set of host dependencies are probably not a wise choice, since they would not differ much from a simple package install on the host OS :) Lots of scenarios are possible; not all of them make sense imho.

schaefi commented 1 year ago

@rjschwei I'm going to close this issue now as fixed. The 3-way dependency provisioning is now available in oci-pilot and I have done several tests. You can see delta container examples using the different dependency models in my OBS project here: https://build.opensuse.org/project/show/home:marcus.schaefer:delta_containers

I will now polish the delta branch in kiwi, which allows creating this sort of container, and open a PR on kiwi for it. The other component, oci-pilot, is a new project that does not exist in Leap/ALP/SLE, but I think it has the potential to be a door opener for ALP and SLE regarding real/maintainable application containers.

It would be great if you could share/point to the work done and collectively enabled in the OBS project in the architecture review meeting, and keep me posted on the feedback. Thanks in advance.