edannenberg / kubler

A generic, extendable build orchestrator.
BSD 2-Clause "Simplified" License

Config Management #158

Open r7l opened 5 years ago

r7l commented 5 years ago

There should be a way to check or control changes in config files when creating containers with Kubler. It would help prevent containers from ending up in a broken state.

In Gentoo there is etc-update which does not work when creating containers with Kubler. It will probably not be an easy task to implement anything that would resemble etc-update.

It might be possible to simplify it with a function that can be called after editing configs with sed in finish_rootfs_build, in order to display at least a diff between the shipped version and the version edited with build.sh.
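A minimal sketch of such a diff helper, for illustration only. The function names and the _CONFIG_SNAPSHOT_DIR variable are hypothetical and not part of Kubler; the idea is simply to copy the shipped file aside before the sed edits and diff against that copy afterwards.

```bash
# Hypothetical helpers, not part of Kubler.
# _CONFIG_SNAPSHOT_DIR is an assumed scratch directory outside the rootfs,
# so the pristine copies do not end up in the final image.
_CONFIG_SNAPSHOT_DIR="${_CONFIG_SNAPSHOT_DIR:-$(mktemp -d)}"

# snapshot_config <file>: keep a copy of the shipped file before editing it.
snapshot_config() {
    local file="$1"
    mkdir -p "${_CONFIG_SNAPSHOT_DIR}/$(dirname "${file}")"
    cp "${file}" "${_CONFIG_SNAPSHOT_DIR}/${file}"
}

# show_config_diff <file>: print a unified diff of shipped vs. edited version.
# Exit status is that of diff: 0 if identical, 1 if different, 2 on error.
show_config_diff() {
    local file="$1"
    diff -u "${_CONFIG_SNAPSHOT_DIR}/${file}" "${file}"
}
```

In finish_rootfs_build this would be called as snapshot_config "${_EMERGE_ROOT}"/etc/foo.conf before the sed edits and show_config_diff "${_EMERGE_ROOT}"/etc/foo.conf after them.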

edannenberg commented 5 years ago

I have given this some thought, it's certainly a useful feature but I'm still not sure how to incorporate this neatly. The hard part is not the diffing but that the process is inherently something that will require manual input at some point.

> In Gentoo there is etc-update which does not work when creating containers with Kubler. It will probably not be an easy task to implement anything that would resemble etc-update.

etc-update respects ROOT, so this part is straightforward:

  1. save config folder/files you modified or want to watch to /config as last step in finish_rootfs_build hook
  2. in configure_rootfs_build hook copy any existing old config in image dir to $_EMERGE_ROOT/etc
  3. run etc-update in finish_rootfs_build hook and it will work as usual as ROOT is still set

We could run etc-update --preen as part of the build which auto merges trivial changes, but currently I don't see a good way to handle manual config merges.
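The three steps above could be sketched roughly as follows. This is an assumption-laden illustration, not Kubler code: _CONFIG_STORE is a hypothetical variable for a directory that persists between builds (for example a folder in the image directory), and the helper names are made up. Only _EMERGE_ROOT, the two hook names, and etc-update --preen come from the discussion.

```bash
# Sketch only. _CONFIG_STORE (hypothetical) must survive between builds.

# Step 2, call from configure_rootfs_build: copy the config saved by the
# previous build into the fresh rootfs so etc-update has something to compare.
restore_saved_config() {
    [ -d "${_CONFIG_STORE}/etc" ] || return 0   # first build, nothing saved yet
    mkdir -p "${_EMERGE_ROOT}/etc"
    cp -a "${_CONFIG_STORE}/etc/." "${_EMERGE_ROOT}/etc/"
}

# Step 3, call from finish_rootfs_build: auto-merge trivial changes only.
# Requires Gentoo's etc-update; ROOT points it at the rootfs being built.
merge_trivial_config_changes() {
    ROOT="${_EMERGE_ROOT}" etc-update --preen
}

# Step 1, call last in finish_rootfs_build: save the files you want to watch.
# Paths are relative to the rootfs, e.g. save_watched_config etc/foo.conf
save_watched_config() {
    local path
    for path in "$@"; do
        mkdir -p "${_CONFIG_STORE}/$(dirname "${path}")"
        cp -a "${_EMERGE_ROOT}/${path}" "${_CONFIG_STORE}/$(dirname "${path}")/"
    done
}
```

Non-trivial changes are left untouched by --preen, so this sketch does not answer the open question of how manual merges should be handled.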

r7l commented 5 years ago

As stated in my earlier comment, i don't think it should be interactive. An interactive process would destroy the fully automated build process.

I really like that idea with a /config directory. Maybe it would be possible to save the most recent state in a file in /var/lib/kubler/configs/ (or whatever is configured) and mount it into the build container at build time, in order to compare it just before packing everything together. From what i have in mind, it should work even better than the etc-update process on a normal system update. If you do all the edits with sed and compare and save afterwards, it would always compare already edited files with each other. In most instances there shouldn't be much to worry about.

I think, a visible warning about discovered changes should already do it. It's up to the user then to step in and do something.

edannenberg commented 5 years ago

> As stated in my earlier comment, i don't think it should be interactive. An interactive process would destroy the fully automated build process.

Indeed, the dilemma is that proper config management will require manual input at some point. For some use cases running etc-update as part of the build might be perfectly fine but it's not an option if you want to run an automated build server.

> I think, a visible warning about discovered changes should already do it. It's up to the user then to step in and do something.

I don't think a notice will do, if config changed the build should at least fail. The hard part is how the further process to resolve the changes should be handled.

For now I'll probably just add a helper function to improve the sed approach for config changes. While sed is easier in terms of bit rot and upstream config changes, it's not without flaws either. The helper should check that the value to be changed actually exists before updating it; if it's no longer present, the build should fail.
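A minimal sketch of what such a helper might look like. This is not Kubler's actual implementation; the name replace_or_fail is hypothetical, and the sketch assumes GNU sed and that neither pattern nor replacement contains the | character used as the sed delimiter.

```bash
# Hypothetical helper, not Kubler's implementation.
# replace_or_fail <pattern> <replacement> <file>
# Fails (non-zero return) if the extended regex <pattern> matches no line,
# otherwise replaces every match in place.
replace_or_fail() {
    local pattern="$1" replacement="$2" file="$3"
    if ! grep -qE -e "${pattern}" "${file}"; then
        echo "error: pattern '${pattern}' not found in ${file}" >&2
        return 1
    fi
    # Assumes '|' does not occur in pattern or replacement.
    sed -i -E -e "s|${pattern}|${replacement}|g" "${file}"
}
```

In build.sh the call would be followed by || exit 1 (or run under set -e) so that a missing value stops the build.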

r7l commented 5 years ago

You are right that the build should fail. But i wonder if that sed helper is actually a good approach on its own. It will improve the ease of use of Kubler and help in a large number of instances, but in some configurations new settings will be added to the config. For someone running a certain piece of software only within a container, it might be hard to notice any new lines.

r7l commented 5 years ago

Sorry for taking so long to test your latest addition. It works nicely for the first few images i've updated. I will try to modify all my images to work with this and wait for a configuration change to see the effect.

I've noted 2 things:

The example in build.sh template

The description and example in the build.sh template might be misleading for new users. The example reads: sed-or-die '^foo' 'replaceval' /etc/foo.conf. This would most likely lead to an error, because the config file is not at that path inside the build container. It should rather be: sed-or-die '^foo' 'replaceval' "${_EMERGE_ROOT}"/etc/foo.conf

New options being added to config files over time

The other thing is that i am wondering about changes to the files that might still get lost. For example new lines / options added to configuration files. You've mentioned on Discord that it might be a problem with dates inside the files. That makes sense. What about another command that would count the number of selected files? Put it into a file, similar to what you do in PACKAGE.md.

It could look like this: compare_or_die "${_EMERGE_ROOT}"/etc/foo.conf. This would lead to a simple check for an entry in a file like COMPARE.md, for example /etc/foo.conf 123 (should include emerge root). It would count the number of lines and break once that number does not match the last run. If there is no entry, it creates one and passes.
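The proposal could be sketched like this. Everything here is hypothetical: compare_or_die and COMPARE.md are only the names suggested above, and the COMPARE_FILE variable and the choice to store the path with the _EMERGE_ROOT prefix stripped are assumptions made for the example.

```bash
# Hypothetical sketch of the proposed line-count check, not a Kubler function.
# COMPARE_FILE (assumed name) holds entries like "/etc/foo.conf 123".
COMPARE_FILE="${COMPARE_FILE:-COMPARE.md}"

# compare_or_die <file>
# Records the line count on the first run, fails if it differs on later runs.
compare_or_die() {
    local file="$1" key count recorded
    if [ ! -r "${file}" ]; then
        echo "error: cannot read ${file}" >&2
        return 1
    fi
    key="${file#"${_EMERGE_ROOT:-}"}"            # store path without the root prefix
    count="$(wc -l < "${file}" | tr -d '[:space:]')"
    touch "${COMPARE_FILE}"
    recorded="$(awk -v k="${key}" '$1 == k { print $2 }' "${COMPARE_FILE}")"
    if [ -z "${recorded}" ]; then
        echo "${key} ${count}" >> "${COMPARE_FILE}"   # no entry yet: create and pass
        return 0
    fi
    if [ "${recorded}" != "${count}" ]; then
        echo "error: ${key} has ${count} lines, expected ${recorded}" >&2
        return 1
    fi
}
```

A line count only catches added or removed lines; a changed value on an existing line passes unnoticed, which is where the sed helper still has to do its part.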

edannenberg commented 5 years ago

> The description and example in the build.sh template might be misleading for new users.

Fair point and shall get updated. Thanks!

> ...about changes to the files that might still get lost. For example new lines / options added to configuration files. What about another command that would count the number of selected files?

Again, detecting the changes is a solved problem, we can just use etc-update or dispatch-conf which will also handle trivial changes gracefully. The question is how are non-trivial changes resolved. What might work is using the timeout command:
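A minimal sketch of how a timeout could wrap an interactive merge step. This is an illustration of the idea under stated assumptions, not the snippet from the comment: the wrapper name run_or_timeout is hypothetical, and it relies on GNU coreutils timeout, which exits with status 124 when the time limit is hit.

```bash
# Hypothetical wrapper, assuming GNU coreutils `timeout`.
# run_or_timeout <seconds> <command...>
# Runs the command and gives up after <seconds>, returning 124 in that case.
run_or_timeout() {
    local seconds="$1"
    shift
    local rc=0
    # --foreground lets the wrapped command read from the TTY, which an
    # interactive tool needs when timeout is started from a script.
    timeout --foreground "${seconds}" "$@" || rc=$?
    if [ "${rc}" -eq 124 ]; then
        echo "error: '$*' was not completed within ${seconds}s" >&2
    fi
    return "${rc}"
}
```

With this, an unattended build would fail once the limit passes, while someone watching the build could still resolve the merge by hand, for example with run_or_timeout 300 etc-update inside finish_rootfs_build.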

r7l commented 5 years ago

Ok. Didn't mean to bother you with this again. I don't really see an interactive configuration as a good solution. But it might be a completely different topic anyways.

I did some further testing and the new sed-or-die function did actually find a line that was failing and stopped. I am closing this issue for now.

edannenberg commented 5 years ago

> Ok. Didn't mean to bother you with this again. I don't really see an interactive configuration as a good solution. But it might be a completely different topic anyways.

No worries, what would a non-interactive solution look like to you? Inherently this process will require manual interaction to resolve; integrating etc-update/dispatch-conf looks like the best tool for the job to me.

r7l commented 5 years ago

When doing updates of a namespace, i am usually not following the process. Usually i leave it running and check back at some point. I am not in a hurry with the builds nor am i interested to spend my time and see it working for a long time. Any interactive process would simply timeout for me for the most part or slow down any update.

I also wonder how etc-update could even work? Since any update would create a completely new container, you don't even have the old version of the file to compare it to. From my understanding etc-update would not show anything. You would have to store the configuration somewhere outside of any container as the build container might be updated and replaced as well.

I am not much of a user of dispatch-conf. Reading the Gentoo Wiki page about it, it seems to work with a repo and compares files to it. But with any diff command we would be back to the initial issue of having minor updates (like dates or versions) not covered or break the entire build process.

The sed_or_die function is great and useful anyways!

Just thought about another approach to the entire thing:

Checksum for specific files

How about a function that would allow you to compare a file with a given checksum? If it doesn't match, it fails the build process and shows the checksum of the file in question. All you would need to do then is copy and paste the shown checksum into the function call for that file in build.sh and rerun the process. You could simply use sha256sum and compare it to the file in question.

This would still require a few files to be manually updated in build.sh on each update run.
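A minimal sketch of such a checksum check, using sha256sum as suggested. The function name checksum_or_fail is hypothetical and not part of Kubler.

```bash
# Hypothetical helper, not part of Kubler.
# checksum_or_fail <expected-sha256> <file>
# Fails and prints the actual checksum if it does not match the expected one.
checksum_or_fail() {
    local expected="$1" file="$2" actual
    actual="$(sha256sum "${file}")" || return 1
    actual="${actual%% *}"                  # keep only the hash, drop the file name
    if [ "${actual}" != "${expected}" ]; then
        echo "error: checksum mismatch for ${file}" >&2
        echo "  expected: ${expected}" >&2
        echo "  actual:   ${actual}" >&2
        return 1
    fi
}
```

Note that the check has to run on the shipped file before any sed edits (or on a file that is never edited), and that any upstream change, including a date or version comment, will fail the build until the checksum in build.sh is updated.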

edannenberg commented 5 years ago

> Any interactive process would simply timeout for me for the most part or slow down any update.

It would timeout and then fail the build. How is this different to a hard fail if a config change was detected?

> I also wonder how etc-update could even work? Since any update would create a completely new container, you don't even have the old version of the file to compare it to. From my understanding etc-update would not show anything. You would have to store the configuration somewhere outside of any container as the build container might be updated and replaced as well.

See my earlier comment on how the process would work; you can already do this manually to get a feel for it. Yes, config would have to be saved in the image directory so it can persist, which would not be much different from maintaining your changed config manually.

> I am not much of a user of dispatch-conf. Reading the Gentoo Wiki page about it, it seems to work with a repo and compares files to it. But with any diff command we would be back to the initial issue of having minor updates (like dates or versions) not covered or break the entire build process.

Me neither, however dispatch-conf does handle minor changes like dates or comments gracefully.

r7l commented 5 years ago

Maybe i don't see the bigger picture here. But assume a situation like the one you have in this image: https://github.com/edannenberg/kubler-images/blob/master/images/redis/build.sh

How would you approach the idea of having a config file stored? You're not doing it right now. The file would mostly go through with sed_or_die but you'll never know about any additional configuration option that will be added over time. That's why i had this idea with the checksum as you would do it right after finish_rootfs_build went through and check if the selected config file still matches this checksum.

On some packages with larger config files like php.ini, it might be helpful. That file is not changing on every update. But it is from time to time. I use sed_or_die on php.ini currently which will always pass as long as the options i am aiming for will stay the same.

Then there are packages like collectd. They have a new line in almost every second update. Mostly i don't need those lines and ignore them, other than running etc-update once.

I am using Kubler remotely to build my images on a server. I do this in order to spare myself from using a powerful machine locally. I am writing the images on my local computer, push those to git and have them pulled into the namespace directories. I don't even have Kubler installed locally. For anyone having a remote Kubler setup like i do, any changes to the configuration should not mean that people have to copy and paste large parts of that config into the Kubler image files. Not sure if etc-update / dispatch-conf would cover that.

Maybe we talk about this on Discord as it might be a quicker way to talk it through. But i might not have enough time up until next week.