PlanktoScope / forklift

Composable, reprovisionable, decentralized management of apps & configs on Raspberry Pis and other embedded Linux systems
Apache License 2.0

composition: Enable layering of pallets #253

Open ethanjli opened 1 month ago

ethanjli commented 1 month ago

Currently, the only way to make a variant of a pallet is to fork it into a new repo and then synchronize changes between the upstream repo and the downstream fork; any changes from other upstream pallets must be copied manually, with no support in Git for synchronizing future changes from those upstreams. Making variants of pallets would be easier if we could take a "layering" approach (like how container images can be composed by copying files from multiple other container images). Prototypical motivating use-cases include:

Just like how pallets have a requirements/repositories subdirectory which is manipulated with the [dev] plt add-repo and [dev] plt rm-repo subcommands, we can add a requirements/pallets subdirectory which also uses forklift-version-lock.yml files, and we can add [dev] plt add-plt and [dev] plt rm-plt subcommands. Note that we'd probably want to provide a way to easily (check for and) switch to newly-released versions of required pallets, as a parallel to #246. This could be implemented by manually adding /requirements/pallets/{pallet path}/forklift-updates.yml files, for example.
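As a rough sketch of what such a pallet requirement might look like on disk (the path layout follows the proposal above, but every field name below is hypothetical; the real schema would mirror whatever `requirements/repositories`' `forklift-version-lock.yml` files already use):

```yaml
# requirements/pallets/github.com/example/base-pallet/forklift-version-lock.yml
# Hypothetical contents - field names here are illustrative, not Forklift's actual schema:
type: version
tag: v2024.0.0
commit: 0123456789abcdef0123456789abcdef01234567
```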

Then we need to figure out how to organize the inclusion of files (e.g. files/directories under requirements and/or files/directories under deployments) from required pallets, e.g. with one or more of the following options:

We would probably want to enable any files provided by the pallet to override files imported at the same paths from required pallets. This would make it easy to import all files from some other pallet, and then just override a particular package deployment declaration.
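A hedged sketch of that override rule (the function name and flat path-to-contents representation are illustrative assumptions, not Forklift's internals): files imported from required pallets form the base mapping, and the pallet's own files win on any path collision.

```python
def merge_with_overrides(imported: dict[str, str], own: dict[str, str]) -> dict[str, str]:
    """Merge files imported from required pallets with the pallet's own files.

    Both arguments map target paths to file contents (or content hashes).
    The pallet's own files take precedence wherever both define a file.
    """
    merged = dict(imported)  # start from everything imported from required pallets
    merged.update(own)       # the pallet's own files override at the same paths
    return merged

# Example: import everything from another pallet, then override one deployment.
imported = {
    "deployments/app/forklift-deployment.yml": "from upstream",
    "deployments/db/forklift-deployment.yml": "from upstream",
}
own = {"deployments/app/forklift-deployment.yml": "local override"}
merged = merge_with_overrides(imported, own)
```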

We also need to figure out how to ensure that we safely handle transitive requirements among pallets (especially if we are able to include forklift-includes.yml files from other pallets), how we can prevent circular dependencies, and how we can deal with conflicting files among different pallets. For example, the simplest way to prevent file-import conflicts among required pallets is to prohibit a pallet from importing a non-directory file to the same target path from multiple distinct pallets; instead, the pallet must explicitly select which pallet that file will be imported from.
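The conflict rule described above could look roughly like this (an illustrative sketch, not the real implementation): collect each required pallet's imports by target path, and reject any non-directory target path claimed by more than one pallet unless the pallet has explicitly selected a winner.

```python
def check_import_conflicts(imports_by_pallet, selections=None):
    """Resolve which required pallet provides each imported file.

    imports_by_pallet: maps each required pallet's path to the set of target
    paths (non-directory files) it would import into this pallet.
    selections: optional map from a conflicting target path to the one pallet
    explicitly chosen to provide that file.
    Returns a {target path: providing pallet} map, or raises ValueError on an
    unresolved conflict.
    """
    selections = selections or {}
    providers = {}  # target path -> set of pallets wanting to provide it
    for pallet, paths in imports_by_pallet.items():
        for path in paths:
            providers.setdefault(path, set()).add(pallet)
    resolved = {}
    for path, pallets in providers.items():
        if len(pallets) == 1:
            resolved[path] = next(iter(pallets))
        elif selections.get(path) in pallets:
            resolved[path] = selections[path]  # the pallet picked a winner
        else:
            raise ValueError(f"conflicting imports of {path} from {sorted(pallets)}")
    return resolved
```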

To merge all the layered pallets together, we'd probably want to build a merged filesystem which we can use as the underlay, with the pallet itself as the overlay, of an overlay filesystem. We might want to export the resulting overlay filesystem separately as merged-pallet in the staged pallet bundle.
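A minimal sketch of that composition (illustrative Python treating each layer as a flat map from target paths to file contents; nothing here reflects Forklift's actual code): the required pallets' files are merged into the underlay, and the pallet's own files are applied last, like the upper directory of an overlay filesystem.

```python
from functools import reduce

def compose_layers(required_layers, pallet_files):
    """Build the merged-pallet view from its layers.

    required_layers: list of {target path: contents} maps, one per required
    pallet, assumed already free of unresolved conflicts (so the order in
    which they are merged does not matter).
    pallet_files: the pallet's own files, applied last as the overlay.
    """
    underlay = reduce(lambda acc, layer: {**acc, **layer}, required_layers, {})
    return {**underlay, **pallet_files}  # the exported "merged-pallet" view
```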

ethanjli commented 3 weeks ago

Test cases which must be handled include:

In each case, we need to handle overrides of particular files which might be in any pallet.

Thinking carefully about the tree structure of this file import problem may help make the design of the file import mechanism more rigorous with respect to combining transitive file imports from disparate pallets.
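One way to make the transitive-requirement concern concrete is a plain depth-first cycle check over the pallet requirement graph (an illustrative sketch; the graph representation and names are assumptions, not Forklift's internals):

```python
def find_cycle(requirements):
    """requirements: maps each pallet to the list of pallets it requires.
    Returns a list of pallets forming a cycle (first pallet repeated at the
    end), or None if the requirement graph is acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color = {p: WHITE for p in requirements}
    stack = []  # current DFS path

    def visit(p):
        color[p] = GRAY
        stack.append(p)
        for dep in requirements.get(p, []):
            if color.get(dep, WHITE) == GRAY:
                # dep is already on the current path: we found a cycle
                return stack[stack.index(dep):] + [dep]
            if color.get(dep, WHITE) == WHITE:
                color.setdefault(dep, WHITE)
                cycle = visit(dep)
                if cycle:
                    return cycle
        color[p] = BLACK
        stack.pop()
        return None

    for p in list(requirements):
        if color[p] == WHITE:
            cycle = visit(p)
            if cycle:
                return cycle
    return None
```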

ethanjli commented 3 weeks ago

Here's a formal mathematical description of the file import-based pallet layering mechanism I am considering; the process of constructing this description helped me to design this mechanism, but it might not be very useful beyond that purpose (because we can't run it through a logic checker, and documentation should be done in English):

The overall result of this layering system should be a declarative equivalent of the imperative FROM and COPY directives for layering container images in multi-stage Dockerfiles/Containerfiles (where we can copy certain files at certain paths from previous stages to certain paths in the current stage, and we can also override those files with different file contents). The difference is that the result of a Dockerfile/Containerfile depends on the order of COPY directives within each stage, while my proposed system is meant to remove any need to consider the ordering of file imports: it makes the result independent of such ordering by prohibiting configurations where the ordering would matter.
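The order-independence claim can be illustrated with a tiny check (illustrative only; the flat path-to-contents representation is an assumption): if no two import declarations claim the same non-directory target path, then every permutation of the declarations merges to the same result.

```python
from itertools import permutations

def apply_imports(declarations):
    """declarations: iterable of {target path: contents} maps, one per import
    declaration file. Assumes conflicts were already prohibited, i.e. no
    target path appears in more than one declaration."""
    result = {}
    for decl in declarations:
        result.update(decl)
    return result

decls = [
    {"deployments/app/forklift-deployment.yml": "A"},
    {"deployments/db/forklift-deployment.yml": "B"},
    {"requirements/repositories/x/forklift-version-lock.yml": "C"},
]
# Merge under every possible ordering and collect the distinct outcomes:
results = {tuple(sorted(apply_imports(p).items())) for p in permutations(decls)}
assert len(results) == 1  # every ordering produces the same merged result
```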

We can decompose IT(P)(p) and IS(P)(p) as follows:

With this design, Forklift can arbitrarily reorder the sequence of import declaration files d ∈ ∪{ID(P)(p) | p ∈ D(P)} when evaluating them, without affecting the value of ( N'(P), F'(P) ) - just as Forklift can arbitrarily reorder the loading of packages required by a pallet, and the loading of package deployments in a pallet, without affecting the actual result of applying the pallet. Sensitivity to ordering is still allowed, but only within the scope of a single file. For example, the sequence in which a package deployment's feature flags are applied (which determines the order in which certain Docker Compose files override other Docker Compose files, and which may determine the order in which certain file exports override other file exports) is encapsulated within the package deployment file; likewise, the sequence of operations by which ITD(P)(p)(d) and ISD(P)(p)(d) are constructed for any given d ∈ ID(P)(p) is encapsulated within file d.