Added system stash command

schaefi commented 3 years ago

The system stash command allows to create a container image from a given root tree. The result can be considered a cache file for the stackbuild command.

schaefi commented 3 years ago

@davidcassany this is a very simple implementation of the stash command to turn a given root tree into a docker container. I'd like to discuss this code and have some questions.

If the stash command is called for the purpose to create a root tree cache, I don't think we will ever need a derived container reference, or am I mistaken ?
I intentionally did not read an in-tree XML description from the build of the root-tree as I think the information about the author, tag and container name is specific to the creation of the cache and not necessarily connected with the kiwi build that created the root tree. Thoughts ?
I think it would be good to optionally register the image in a local registry via podman. I would leave the remote push outside of the plugin code and in the hands of the user though

Feedback welcome

Thanks

Conan-Kudo commented 3 years ago

I'll take a look at this a bit later (working on a bunch of things atm), but @iammattcoleman, can you look at this?

schaefi commented 3 years ago

Thanks, I think if people looking at it who might not have the information from the initial conversation it is helpful to share some background information here:

We would like to create two new sub-commands in a plugin called stackbuild

stackbuild
stash

The idea is to allow a kiwi image build based on the contents of a container which serves as the root-tree to begin with. One use case would be to rebuild an image from a cache. In terms of commands this would look like the following

$ kiwi-ng system build ... --target-dir /tmp/myimage
$ kiwi-ng system stash ... --root /tmp/myimage/build/image-root --target-dir /tmp/caches

... some time later

$ kiwi-ng system stackbuild --container-image cache_name -- ... --target-dir /tmp/myimage-rebuild

stackbuild should be allowed to use any container from some registry, such that the use case is not limited to just caches. The stash command as implemented here just exists to allow the creation of a container from some given root tree. I like to extend the stash command to also do podman import ... such that the local registry has this stash image, but this is not yet implemented as I'd like to have a conversation about the current state and if all that makes sense

Thanks

davidcassany commented 3 years ago

* If the stash command is called for the purpose to create a root tree cache, I don't think we will ever need a derived container reference, or am I mistaken ?

This is indeed a good question. My idea is that you (as a user) don't need a derived container reference, but you (as a developer) certainly need to take into account caches that might be built on top of another one. So you don't need the reference because in the cli because this is already part of the root tree, hence if any, it already fixed. I'd follow a similar logic as we do for containers build in stackbuild command. The result of a root tree of a stackbuild command should include, together with the XML description, the original container it is based on, this way if after the stackbuild command the user calls stash it has the chance to append on top of the original container an extra layer including only the changes applied during the stackbuild.

schaefi commented 3 years ago

@davidcassany finally I found the time to refactor the code for the stash command and put on the feedback I got from your side. Also added tests and worked with the stash command to see if it actually makes sense. From my perspective it does as it is right now but probably there are still some questions to be clarified

schaefi commented 3 years ago

This is indeed a good question. My idea is that you (as a user) don't need a derived container reference, but you (as a developer) certainly need to take into account caches that might be built on top of another one

yep and done that way now, every new stash will add a layer on top of an existing one, if present

schaefi commented 3 years ago

I also thought a list command would be useful to list the available stashes

schaefi commented 3 years ago

@davidcassany And after that I finally reached the state where you have started to work on the XML merging strategy. The way the stash container is created will currently include the image XML configuration from the root-path to stash. Now if one uses this stash for a build of an image I was questioning myself:

Treat the stash embedded image configuration as the primary config This means you basically build the stash image with modifications and ignore the build image config
Treat the build image config as the primary config This means the stash config gets overwritten
Try some merging strategy between the stash config and the build image config At this point I could imagine a full can of worms and unexpected results

At the moment I think the stash should not include the kiwi image description, such that it can be used generically for any build process. This of course can lead to inconsistent systems, e.g selecting a leap stash to build a TW image. I would expect the build process based on incompatible stash roots to fail at the package manager level. From my perspective this is a responsibility of the user to create and re-use stash containers in a workable way. As there are so many pitfalls with pre-defined root environments which we don't care for in kiwi when using "--allow-existing-root" I think we also don't need to try to be clever when the root comes from a container stash.

All this just my thoughts and I'm happy if we can have a conversation about it

Thanks much

davidcassany commented 3 years ago

The way the stash container is created will currently include the image XML configuration from the root-path to stash.

I think this is the right way, so the image data is self contained. So it can be used for reproducibility use cases as I do not have to worry about keeping the original XML (imagine they are part of an evolving git repository, I don't want to be forced to keep the stashed image and its generator commit together). I do not remember if runtime choices are kept, mostly the profile that created the root-tree (probably the ideal case would be to keep the profiled XML), I would not care about specific runtime flags changing repos or stuff like that, those cases are out of scope IMHO.

Now if one uses this stash for a build of an image I was questioning myself:
* Treat the stash embedded image configuration as the primary config
  This means you basically build the stash image with modifications and ignore the build image config

I am not sure I understand what you mean. But I'd never give priority to the stashed XML compared some XML provided by the user.

* Treat the build image config as the primary config
  This means the stash config gets overwritten

I believe this makes sense, if I provide an XML, I want KIWI to make use of it, regardless of whatever there was in the stashed image.

* Try some merging strategy between the stash config and the build image config
  At this point I could imagine a full can of worms and unexpected results

LOL :laughing: Yes, I would not consider a merging two XMLs that are compliant with the our schema. This is too complex, profiling could become a real mess. I quickly elaborated on my idea of merging XML below, which follows a different approach. Not that I really believe this is something we need, just as a mental exercise. :smile:

At the moment I think the stash should not include the kiwi image description, such that it can be used generically for any build process. This of course can lead to inconsistent systems, e.g selecting a leap stash to build a TW image. I would expect the build process based on incompatible stash roots to fail at the package manager level. From my perspective this is a responsibility of the user to create and re-use stash containers in a workable way. As there are so many pitfalls with pre-defined root environments which we don't care for in kiwi when using "--allow-existing-root" I think we also don't need to try to be clever when the root comes from a container stash.

I slightly disagree, while I agree we should not try to be smart I'd expect a slightly different behavior. Let me consider few cases:

Building on top of a stashed image coming from anywhere, no checks, no stashed XML. Clearly the user has to provide an XML otherwise there is no way KIWI can do anything.
Building on top of a stashed image created by KIWI and the user provides an XML. In this case the XML provided by the user takes precedence and all the project stored within the stashed image is wiped and import phase (XML, config.sh, etc.)
Building on top of a stashed image created by KIWI and the user does not provide an XML. In that case the in image XML is used at the image gets rebuild. Here is the rebuild concept.

These are the use cases I'd consider. For that I'd say the rule is:

If user provides an XML, the stashed XML, if any, is completely ignored.
If user does not provide an XML only stashed images including an XML are possible, otherwise it simply fails after fetching the stashed image.

All this just my thoughts and I'm happy if we can have a conversation about it

Sure lets have a conversation if there are still grey areas. JFYI next Friday I start a three weeks vacation.

My approach on merging strategy

My approach was to define a new schema (based on the artifacts of the current one, so no redefinition of elements) that is less restrictive (does not require description, neither preferences and does not consider profiles). This schema purpose would only be to define descriptions that are not complete, they are only meant to be merged on top of a real valid schema. In fact, the stashed image could be a required attribute of <image> element. So the idea is that the plugin parses using this new type of schema, collects data and applies the data into real fully valid schema found inside the image. Some thing like an structured way of modifying an XMLState instance.

My idea was pretty simple, forget about profiles and distinguish elements by two categories: (i) drop-in elements, (ii) mergeable. And only care about top level elements. For instance <packages> is a drop-in element, the merging strategy for them is dumb, I just add the whole section as there were multiple <packages> sections. Then, <preferences> would be a mergeable section, so that requires specific logic to define what merging means. I got up to the conclusion only preferences and description sections are mergeable and also thought that by removing profiles from the picture turns the idea of merging some <preferences> into something doable. I did not consider going deeper in the three, so no specfic logic merging <type>, keep the stashed one or replace with a new provided one.

My idea was to allow XMLs that could look like:

<image stashed="opensuse/jeos-live" name="appliance_on_top_of_jeos">
  <packages type="image"> <!-- This would be computed as a full new packages section added in the stashed XML -->
     <!-- a list of additional packages -->
  </packages>
</image>

or

<image stashed="opensuse/jeos-live">
   <preferences>
      <!-- This results into a replacement of the whole built `<type>`-->
      <type image="iso"....> <!-- some different type section -->
   </preferences>
</image>

So all of these I think it is reasonable as long as one considers:

the <image stashed="..."> XML is not a KIWI compliant XML, it is mapped against a new schema that is slightly different compared the original.
the stashed XML is already profiled (it does not include profiles, it is the actual digested XML used for the build)
the merging strategy is simple and only up to top level elements and it always results into a full KIWI XML that is validated against the KIWI schema.

schaefi commented 3 years ago

@davidcassany Thanks much for your feedback :+1:

Building on top of a stashed image created by KIWI and the user provides an XML. In this case the XML provided by the user takes precedence and all the project stored within the stashed image is wiped and import phase (XML, config.sh, etc.)

Building on top of a stashed image created by KIWI and the user does not provide an XML. In that case the in image XML is used at the image gets rebuild. Here is the rebuild concept.

This makes perfect sense and changes my mind. You are right let's do it that way when building based on a stash image

schaefi commented 3 years ago

My approach on merging strategy

Thanks for the details on the idea. It feels more like an extension strategy rather than a merging strategy ;) From my perspective this would be nice and can lead to really small descriptions if the stash it is based on provides a good base description to apply the "merging strategy". I also like your coding with the schema extension in kiwi_stackbuild_plugin/schema.rnc. If ok with you I create an issue adding your details provided here and if we go forward we should make sure that nothing of the existing concept code in that area gets lost

Would that be ok ?

davidcassany commented 3 years ago

It feels more like an extension strategy rather than a merging strategy ;)

Yes, extension might be better suited :)

A an issue for that would nice. Thanks

schaefi commented 3 years ago

@davidcassany ok I think I have arranged all the information into issues. I did a squash/rebase and think this PR would be ready for a merge. My next step would then be looking in your stackbuild implementation and adapt with regards to the functionality now provided by stash, adding tests and come up with a next PR.

If there is anything we might postpone for later I would create a feature branch and push it there. So making sure that nothing of your work gets lost.

Would that work for you ?

Thanks

davidcassany commented 3 years ago

Would that work for you ?

Absolutely fine

OSInside / kiwi-stackbuild-plugin

Added system stash command #1

My approach on merging strategy