bsdci / libioc

A Python library to manage jails with ioc{age,ell}
https://bsd.ci/libioc

Thickjail Support #486

Open gronke opened 6 years ago

gronke commented 6 years ago

https://github.com/iocage/iocage/pull/604 documents the -T/--thickjail flag that can be set on jail creation so that the entire root dataset is copied from a source (the release) instead of cloning the dataset.
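To make the difference between cloning and copying concrete, here is a minimal sketch that only builds the ZFS command lines and does not run them. The dataset and snapshot names are hypothetical, and this is not necessarily how iocage implements `-T` internally; it only contrasts `zfs clone` (thin, shares blocks with the release) with `zfs send | zfs receive` (thick, a full independent copy).

```python
def thin_jail_command(release_dataset, snapshot, jail_root):
    """Return the command that clones the release snapshot (thin jail)."""
    source = f"{release_dataset}@{snapshot}"
    return ["zfs", "clone", source, jail_root]


def thick_jail_pipeline(release_dataset, snapshot, jail_root):
    """Return the (sender, receiver) commands that copy the release (thick jail).

    The output of the first command is meant to be piped into the second.
    """
    source = f"{release_dataset}@{snapshot}"
    return (["zfs", "send", source], ["zfs", "receive", jail_root])


if __name__ == "__main__":
    # Hypothetical names, chosen for illustration only.
    release = "zroot/iocage/releases/12.0-RELEASE/root"
    jail = "zroot/iocage/jails/web01/root"
    print(" ".join(thin_jail_command(release, "p0", jail)))
    send, receive = thick_jail_pipeline(release, "p0", jail)
    print(" ".join(send), "|", " ".join(receive))
```

The clone stays dependent on the release snapshot, while the received dataset has no origin and can be replicated on its own, at the cost of storing the release data again.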

This feature does not exist in libiocage, and we need to evaluate whether to implement it.

ole-db commented 6 years ago

The motivation for this feature was discussed here: https://github.com/iocage/iocage/issues/495

gronke commented 6 years ago

Thickjails do not seem to be the best solution for keeping jails in sync across hosts. Duplicating release assets not only increases storage consumption but also slows down jail creation.

There are two other approaches that I would prefer:

  1. Send/receive the release to the remote host under a different name, so that multiple copies of the same release exist on the remote host. libiocage does not care about the data stored in a release and from the ZFS perspective
  2. The export CLI command diffs a basejail against its release and bundles only the differing files into a tar archive. Importing this archive on the remote host re-creates a jail from the local release with the same name and patch level, then copies the changes over.
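The second approach can be sketched roughly as follows. This is an illustration under stated assumptions, not the libiocage export implementation: the function names `changed_files` and `export_delta` are hypothetical, the release and the jail are assumed to be mounted as plain directories, and only regular files are compared (deleted files, symlinks and metadata changes are ignored).

```python
import filecmp
import os
import tarfile


def changed_files(release_root, jail_root):
    """Yield paths relative to jail_root that are new or differ from the release."""
    for dirpath, _dirnames, filenames in os.walk(jail_root):
        for name in filenames:
            jail_path = os.path.join(dirpath, name)
            if os.path.islink(jail_path):
                continue  # symlinks are skipped in this sketch
            rel = os.path.relpath(jail_path, jail_root)
            release_path = os.path.join(release_root, rel)
            # A file belongs to the delta if the release lacks it
            # or if its content differs byte for byte.
            if not os.path.isfile(release_path) or not filecmp.cmp(
                jail_path, release_path, shallow=False
            ):
                yield rel


def export_delta(release_root, jail_root, archive_path):
    """Bundle only the differing files into a gzip-compressed tar archive."""
    rels = sorted(changed_files(release_root, jail_root))
    with tarfile.open(archive_path, "w:gz") as tar:
        for rel in rels:
            tar.add(os.path.join(jail_root, rel), arcname=rel)
    return rels
```

On the importing host, the archive would be unpacked on top of a jail freshly created from the matching local release. Walking and comparing every file is the part that costs time as the number of jails grows.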

Depending on the use case, one or the other is more beneficial. For failover systems the first approach is preferable because you can use tools like https://github.com/zrepl/zrepl to keep data in sync. When moving load from one host to another, the export/import approach could be the better fit.

I'm leaning towards deprecating the Thickjail feature in libiocage (although there are large similarities to empty jail creation).

ole-db commented 6 years ago

I would like to vote for keeping this feature.

Yes, you are right. This solution is not optimal in terms of storage consumption. But it is very handy. The whole jail with all its configuration is represented in just one dataset (with its children). And this makes it very easy to handle these datasets and their snapshots across multiple hosts.
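As a rough illustration of why a single self-contained dataset is convenient to replicate, the sketch below builds the commands for a recursive snapshot and a replication stream to another node. It does not execute anything. The host, dataset and snapshot names are hypothetical, and this is not the code of the program mentioned below; it only shows the general `zfs snapshot -r` / `zfs send -R` / `zfs receive -F` pattern.

```python
import shlex


def snapshot_command(dataset, snap):
    """Snapshot the jail dataset and all of its children."""
    return ["zfs", "snapshot", "-r", f"{dataset}@{snap}"]


def replication_pipeline(dataset, new_snap, remote_host, prev_snap=None):
    """Return a shell pipeline that replicates the dataset to remote_host.

    With prev_snap set, only the changes since that snapshot are sent.
    """
    send = ["zfs", "send", "-R"]
    if prev_snap is not None:
        send += ["-i", f"{dataset}@{prev_snap}"]
    send.append(f"{dataset}@{new_snap}")
    receive = ["ssh", remote_host, "zfs", "receive", "-F", dataset]
    return " | ".join(shlex.join(cmd) for cmd in (send, receive))


if __name__ == "__main__":
    # Hypothetical names, chosen for illustration only.
    jail = "zroot/iocage/jails/web01"
    print(shlex.join(snapshot_command(jail, "sync-2")))
    print(replication_pipeline(jail, "sync-2", "node2", prev_snap="sync-1"))
```

Because a thick jail has no clone origin, the receiving node does not need a matching release dataset for this stream to apply.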

I wrote a program I called "iocluster", which is basically a backup scheduler. With it I sync about 50 jails across 4 nodes every 5 minutes. (I want to make it open source, but I have to find time to clean up the code.)

To 1.) I don't know if this works with iocage. The downside would be that you have to manage the releases. And if you have many nodes, you will end up with many releases. And maybe you have one node receiving a number of jails, all from different hosts. Then the advantage disappears.

To 2.) I don't think this would scale to many jails with a high backup rate.

The best solution would be to "switch" the parent of a dataset. But I think this is not possible ...