Merging Butane configs

bgilbert commented 4 years ago

It's convenient for users to write multiple Butane fragments and then merge them together into one config. This is awkward right now: users must separately transpile each fragment and then use Ignition's merge directive to merge them at runtime. Or, they can transpile each Butane fragment and use a wrapper script to build a top-level Butane config that inlines the Ignition fragments using merge directives.

Provide a mechanism to include Butane fragments into another Butane config and produce a unified Ignition config as output. This might be done by extending merge to allow referencing local Butane configs, transpiling each piece, and then performing the Ignition config merge at transpile time.

dghubble commented 4 years ago

I needed to extend terraform-provider-ct with a similar ability to add FCC v1.1.0 support and retain fragment merging in https://github.com/poseidon/terraform-provider-ct/pull/63/commits/8873e4c562197ba830ae2ec22169c35d655e1aba (called snippets there). It leaves much to be desired (i.e. it's not pretty).

With no fragments, fcct's Translate is used. With fragments, the main FCC content is parsed to pick an FCC/Ignition version, then each fragment is Translated and parsed into Ignition so that Ignition Config structs of the same version can be merged. For now, it enforces that the FCC and any fragments are on matching versions.

I'm interested in what this might look like in fcct, which seems to have much nicer internal primitives. My strategy of falling back through the different Ignition versions and calling Parse isn't great.

tkarls commented 4 years ago

I can see two main approaches. One is to have one main FCC that references others with some sort of include. The other is to extend fcct to accept multiple FCC files as input; the tool could then merge them at the YAML level before generating the Ignition file (provided they have the same version etc.).

In any case, making it simpler to divide the FCC config into several files would be appreciated!

NickCao commented 4 years ago

Personally prefer the first approach as it makes the dependency between fcc files trackable.

Okeanos commented 3 years ago

To add to this, it would be very good if the mechanism to specify local configs to be merged didn't rely on --files-dir but a potential separate --merge-dir parameter. That way physically separating a butane config and its includes from separate (common) config snippets that are reused multiple times in other cases becomes way easier to accomplish.

bgilbert commented 3 years ago

@Okeanos Hmm, I don't see files and merged configs as neatly divided into two distinct namespaces. For example, I think there's a reasonable argument that each merged config might want its own files-dir, at which point we'd have N separate namespaces.

The general expectation is that if you have snippets that are commonly reused, you'd render each of them into a separate Ignition config (with its own files-dir), host it at a well-defined URL, and merge it via HTTPS at provisioning time. That allows the snippets to be independently updated without rerendering the parent config.

Okeanos commented 3 years ago

That thought actually occurred to me later as well; forgot to update the comment, though.

jkonecny12 commented 3 years ago

Honestly, I don't have a problem with any of these approaches; however, it would be great to have this. In general, I would like to use Butane configs a bit like roles and playbooks in Ansible. I'm doing that already, but having a Makefile for this is weird. I really think that this functionality should already be there.

Personally, I would take the simplest path. When Butane finds ignition -> config -> merge or similar, it would first find the file with that name and then recursively go through all these files. Seems like the simplest working solution to me.

cgwalters commented 3 years ago

It'd seem really natural to me to just support: butane foo.bu bar.bu > merged.ign.

bgilbert commented 3 years ago

@cgwalters We've avoided supporting that for the same reason we try to avoid command-line options affecting the semantics of the output. The instructions for assembling the final Ignition config would now reside ephemerally in your bash history, rather than persistently in a config file.

cgwalters commented 3 years ago

> The instructions for assembling the final Ignition config would now reside ephemerally in your bash history, rather than persistently in a config file.

The idea is more one could easily write a Makefile to do this too.

bgilbert commented 3 years ago

Yeah, understood. Even in that case, though, the final config would now be specified in a mix of two languages/locations.

cgwalters commented 3 years ago

Also, I expect a lot of simple cases are e.g. butane --files-from . *.bu > merged.ign which is nearly simple enough to not even record in a Makefile.

bgilbert commented 3 years ago

That wildcard is problematic for another reason: the semantics of the resulting config are dependent on the merge order. The usual solution is to add sequence numbers to filenames, but users would need to know to do that.

Even if we were to support ad-hoc merging from the command line, we'd also need to support principled merging of Butane configs via Butane syntax. And it'd be a good idea to add that support first, to avoid encouraging poor config hygiene.
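
To make the ordering concern concrete, here is a minimal sketch of how a wildcard would order fragments. The filenames are hypothetical, and `sorted()` only approximates shell glob expansion (plain lexicographic order, as in the C locale); it is not anything Butane does itself.

```python
# Hypothetical fragment names, numbered with and without zero padding.
unpadded = ["1-base.bu", "2-site.bu", "10-host.bu"]
padded = ["01-base.bu", "02-site.bu", "10-host.bu"]


def wildcard_order(names):
    """Approximate the order in which a `*.bu` glob would list the files."""
    return sorted(names)


# Without padding, "10-host.bu" sorts before "2-site.bu", so the host
# fragment would be merged before the site fragment it is meant to override.
print(wildcard_order(unpadded))
print(wildcard_order(padded))
```

Zero-padded sequence numbers keep the lexicographic order aligned with the intended merge order, which is the convention users would need to know about.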

tkarls commented 3 years ago

@bgilbert Good point about the order! But how does the merge order affect the config? Do later entries overwrite earlier ones, or is the first one kept and later "duplicates" discarded?

I tried to find documentation about the exact functionality of the merge key but cannot seem to find it.

cgwalters commented 3 years ago

> The general expectation is that if you have snippets that are commonly reused, you'd render each of them into a separate Ignition config (with its own files-dir), host it at a well-defined URL, and merge it via HTTPS at provisioning time.

In my case, I have multiple machines in wildly different infrastructure (e.g. one in a public cloud, one on my home network) and I want to factor out common Ignition bits like my SSH key. They may not be able to reach a common URL, and even if they could doing this introduces a whole new level of complexity (e.g. to correctly do this you want to use verification= but if you do that, then you do need to touch each config including it when it changes).

> Even in that case, though, the final config would now be specified in a mix of two languages/locations.

I'm a bit confused; are you arguing against the concept of Makefile in general? I mean my C programs are a mix of C and build rules in not-C Makefile too and I think that's been working OK 😄

> That wildcard is problematic for another reason: the semantics of the resulting config are dependent on the merge order. The usual solution is to add sequence numbers to filenames, but users would need to know to do that.

Do you have an example case in mind where someone might be depending on the merge order in a problematic way?

jkonecny12 commented 3 years ago

> The instructions for assembling the final Ignition config would now reside ephemerally in your bash history, rather than persistently in a config file.

> The idea is more one could easily write a Makefile to do this too.

I'm doing that even now but it's not something I would like to do really.

bgilbert commented 3 years ago

@tkarls Butane doesn't currently have any docs about config merging because Butane doesn't do the merging itself. See the Ignition operator notes for more info.

@cgwalters:

> In my case, I have multiple machines in wildly different infrastructure (e.g. one in a public cloud, one on my home network) and I want to factor out common Ignition bits like my SSH key. They may not be able to reach a common URL, and even if they could doing this introduces a whole new level of complexity (e.g. to correctly do this you want to use verification= but if you do that, then you do need to touch each config including it when it changes).

Yup, to be clear, client-side merging makes a lot of sense for smaller environments. Merging independent config sources at runtime is the more general case, and might make more sense in an enterprise setting where multiple teams independently maintain configs. I was arguing specifically against supporting multiple --files-dir namespaces in a single Butane run, because I don't think it makes sense to scale client-side merging to that degree.

(--files-dir is really just a security feature to prevent configs from doing arbitrary client-side directory traversal. If multiple configs are maintained by a single person or team, it's reasonable to keep all the configs in a single Git repo, and always set --files-dir to the root of the repo.)
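
As a rough illustration of the kind of check --files-dir implies, the sketch below resolves a config-supplied path against a single root and rejects anything that escapes it. The function name and error message are hypothetical; this is not Butane's actual implementation.

```python
import os


def resolve_in_files_dir(files_dir, relative_path):
    """Resolve relative_path inside files_dir, rejecting paths that escape it."""
    base = os.path.realpath(files_dir)
    target = os.path.realpath(os.path.join(base, relative_path))
    # commonpath() equals base only when target is base itself or lies beneath it.
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"{relative_path!r} escapes files-dir {files_dir!r}")
    return target
```

With one repo-wide root, every config in the repo can reference `common/ssh.bu` the same way, while `../outside.bu` or an absolute path outside the root is refused.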

As an aside, in your use case you may not need verification, if TLS certificate validation is good enough for your use case. You're already trusting the cloud not to tamper with the Ignition config in userdata.

> I mean my C programs are a mix of C and build rules in not-C Makefile too and I think that's been working OK 🙂

Sure, but merge semantics are more subtle than object linking. An analogy might be a C program with a lot of #ifdefs. To find out which parts of the code are actually compiled and run, you might need to check the Makefile to see what -D options are being passed to the compiler. But with Butane, the #ifdefs are invisible, and the behavior of the compiled code would depend on the order that the source files are specified in the Makefile. Yes, people can learn to deal with that, but it's a footgun.

> Do you have an example case in mind where someone might be depending on the merge order in a problematic way?

A hardware-specific or workload-specific config might want to override pretty much anything in a site-wide config: the contents of a config file, whether to enable a systemd unit, the size of a root or data partition. (Also, if files.append is used to append directives to a config file, the order of those directives might be semantically significant.) Ignition's merge semantics are designed to encourage child configs to override fields in parent configs (or in elder siblings), exactly so that specialized configs can inherit from base configs in this way.
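
The override behavior described above can be sketched with a toy merge function. This is a deliberately simplified model, not Ignition's actual algorithm (see the Ignition operator notes for the real rules): it assumes child scalars replace parent scalars, dicts merge recursively, and list entries are matched on a hypothetical identity key (`path` or `name`), with unmatched entries appended.

```python
def _identity(item, keys):
    """Return a (key, value) identity for dict items, or None if there is none."""
    if isinstance(item, dict):
        for k in keys:
            if k in item:
                return (k, item[k])
    return None


def merge(parent, child, keys=("path", "name")):
    """Toy merge: the child wins wherever both sides define the same field."""
    if isinstance(parent, dict) and isinstance(child, dict):
        out = dict(parent)
        for k, v in child.items():
            out[k] = merge(parent[k], v, keys) if k in parent else v
        return out
    if isinstance(parent, list) and isinstance(child, list):
        out = list(parent)
        for item in child:
            ident = _identity(item, keys)
            idx = next((i for i, p in enumerate(out)
                        if ident is not None and _identity(p, keys) == ident), None)
            if idx is None:
                out.append(item)  # new entry: append in order
            else:
                out[idx] = merge(out[idx], item, keys)  # same entry: child overrides
        return out
    return child


site = {"systemd": {"units": [{"name": "foo.service", "enabled": True}]}}
host = {"systemd": {"units": [{"name": "foo.service", "enabled": False}]}}

# The result depends on which config is treated as the child.
print(merge(site, host))
print(merge(host, site))
```

Swapping the arguments flips whether the unit ends up enabled, which is why merge order has to be stated explicitly rather than left to a wildcard.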

jkonecny12 commented 3 years ago

> A hardware-specific or workload-specific config might want to override pretty much anything in a site-wide config: the contents of a config file, whether to enable a systemd unit, the size of a root or data partition. (Also, if files.append is used to append directives to a config file, the order of those directives might be semantically significant.) Ignition's merge semantics are designed to encourage child configs to override fields in parent configs (or in elder siblings), exactly so that specialized configs can inherit from base configs in this way.

Exactly because of this, I think that Butane should benefit from the existing Ignition merge feature. What Butane should do is look at the config and translate the '.bu' files referenced in the ignition.merge directive. Basically, do the translation we (or at least I) are doing now in a Makefile.

alvarlagerlof commented 2 years ago

Is there anything up-to-date on how to do this? I see mentions of makefiles but the current docs are in such a state around this topic that I cannot figure out how to do merging at all.

bgilbert commented 2 years ago

@alvarlagerlof Right now, the inputs to config merging are Ignition configs, not Butane configs. You can use Butane to generate both configs, but the child config referenced by the parent ignition.config.merge must be in Ignition format. The reference can be by URL or inline. For more info on config merging semantics, see here.

alvarlagerlof commented 2 years ago

> @alvarlagerlof Right now, the inputs to config merging are Ignition configs, not Butane configs. You can use Butane to generate both configs, but the child config referenced by the parent ignition.config.merge must be in Ignition format. The reference can be by URL or inline. For more info on config merging semantics, see here.

Ah, thank you for the pointer.

jkonecny12 commented 2 years ago

After that, just use a standard Makefile, which will check that the file exists and work out what has changed for you.

lukasbestle commented 2 years ago

Why this is useful

I fully agree with the proposal to combine multiple Butane files into one Ignition file at build time, for two reasons:

- It makes it possible to split long Butane files into more sensible snippets.
- It allows reusing Butane snippets across multiple configurations for different systems.

Both of these use cases might be quite common in the real world, so I assume that a feature that allows importing/including other Butane files will be of high value to the community, especially to those who build Butane configs that are not enterprise-level but still quite complex.

Discussion summary

To get this going, I'd first like to summarize the current state of the discussion:

- There are two different possible user expectations for --files-dir:
  - Either each Butane file comes with its own --files-dir namespace. This is useful in larger deployments where different Butane/Ignition configs are maintained separately and linked to each other. But in these cases this whole feature of merging Butane configs locally at build time is not really relevant as the configs are built separately anyway.
  - Or there is a global --files-dir namespace for the whole build that will be used for all involved Butane files. This is for example how Ansible does it and useful for setups where the whole Ignition config gets built without dependencies on external config. This is also the use case where this feature of merging multiple files is missing in Butane.
- Merge order is important as multiple rules may override each other.
- An explicit syntax for child configs in Butane YAML is preferred over CLI arguments, as Butane syntax makes the setup clearer and more explicit.
- There are two approaches for this feature: import/include and merge. merge is already supported by Ignition and could be used with the same syntax at Butane build time. import/include, on the other hand, allows inserting YAML snippets at arbitrary levels. Both of these approaches are useful in the real world and one cannot be replaced with the other.

Concept

An idea how to implement both approaches (independently from each other as I wrote above):

`import/include` and `merge` via YAML tags

This could be done with a YAML tag like this:

variant: fcos
version: 1.1.0
passwd:
  users: !include 'config/users.yml'

Butane will load that other YAML file and insert the YAML structure as a child node right where it was included. The path will be resolved relative to --files-dir by the same code that is also used for files, trees etc.

A limitation of this syntax is that it's not possible to extend the imported config using native YAML syntax. The following example causes a YAML syntax error:

variant: fcos
version: 1.1.0
passwd:
  users:
    !include 'config/users.yml'
    - name: core
      ...

It could be done in theory by modifying the YAML parser, but then the Butane files wouldn't be valid YAML anymore. An alternative for valid YAML syntax would be a separate !merge tag that merges itself with its parent node (= the referenced file is included like with !include but inserted one level upwards):

variant: fcos
version: 1.1.0
passwd:
  users:
    - !merge 'config/users.yml'
    - name: core
      ...

Native Ignition `merge`

Because of the two different expectations for --files-dir and the two underlying use cases (see above), Butane needs to have two operating modes:

- Default mode (current behavior): Butane leaves merge declarations untouched and lets Ignition resolve them at runtime.
- Recursive mode (new --recursive flag): Butane resolves all merge declarations recursively according to the operator notes and returns a merged Ignition JSON. Butane will have the following behavior in this mode:
  - The --files-dir is used for all Butane files, including child configs. It is not possible to define a separate --files-dir per Butane file.
  - If a child config refers to another local child config (if the child config again uses merge) or to a local file or tree, this path is resolved from the global --files-dir just like in the parent config.
  - Child configs that were loaded via source (URL) are treated as external and may not refer to local child configs, files or trees. They may only use source or inline.

dghubble commented 2 years ago

My original aim was just for github.com/coreos/butane (the Go package) to add the Merge function for a slice of Butane snippets. It can be done (albeit painfully) by external packages (example), but would be much better within the butane package.

This would allow Butane tools to implement merging the same way. Phrased another way, we need to agree on a function that can merge a list of Butane snippets. Discussions about specific flag-based tools or how to expose the feature follow from that.

lukasbestle commented 2 years ago

> My original aim was just for github.com/coreos/butane (the Go package) to add the Merge function for a slice of Butane snippets.

To be honest I can't follow. Isn't this the repo for the butane Go package (and also user-facing tool)?

> Phrased another way, we need to agree on function that can merge a list of butane snippets.

Do you mean purely from the technical perspective? I.e. Butane gets two or more YAML structures to merge (however it may have gotten them) and needs to output a single YAML structure? I'd say the algorithm for this should be exactly the same one as in Ignition, provided we do use the same syntax in the end.

This leaves us with importing/including, which is a related feature but doesn't need the algorithm for config merging.

dghubble commented 2 years ago

Given multiple Ignition []byte contents (regardless of where they originally came from), github.com/coreos/ignition/config does not yet provide a Merge function to output a single Ignition document (absent). Individual package versions do provide a Merge function (e.g. github.com/coreos/ignition/config/v3_3), but there is not a top-level Merge to handle version introspection, etc. And handling versions correctly is important for merging. That was explored in https://github.com/coreos/butane/pull/120 as an improvement over folks having to do it themselves.

Once that is in place, Butane (the package, its users, and the cli tool) could implement a similar Merge functionality, that handles the usual Butane YAML encode/decoding. Potentially introducing new syntax into Butane to help specify the content sources seems further out when the core function is missing.

lukasbestle commented 2 years ago

For those who want to use including and merging right away, I have created a Makefile that uses yq to implement the first part of my suggestion:

###
# Copyright: 2021 Lukas Bestle
# License:   https://opensource.org/licenses/MIT
###

# `-type f` selects regular files
files := $(shell find files -type f)
yaml  := $(shell find . -name '*.yml')

# Final build step
dist/ignition.json: $(yaml) $(files) dist/butane.bu
	butane -d files dist/butane.bu -o dist/ignition.json

# Combines all YAML files into the merged Butane YAML
# Each merging pass resolves the `!include` and `!merge` tags:
# `!include` replaces the tag with the referenced file contents
# `!merge` merges the parent with the referenced file contents
# Multiple passes are used to resolve recursive includes/merges
dist/butane.bu: $(yaml) dist
	cp main.yml dist/.butane.bu

	for number in 1 2 3; do \
	    echo "Merging pass $$number"; \
	    yq eval '(.. | select(tag == "!include")) |= load(.)' -i dist/.butane.bu; \
	    yq eval 'with(.. | select(tag == "!merge"); parent = (parent *+ load(.)) | del(.))' -i dist/.butane.bu; \
	done

	mv dist/.butane.bu dist/butane.bu

# Creates the dist folder if it doesn't exist
dist:
	mkdir -p dist

# Deletes all dist files
.PHONY: clean
clean:
	rm -r dist

# Spins up a temporary HTTP server to serve the ignition config
.PHONY: serve
serve: dist/ignition.json
	cd dist; python3 -m http.server

Usage

The Makefile assumes the following directory structure:

dist/
  butane.bu
  ignition.json
files/
  your-files-and-trees
main.yml
Makefile
your-custom-structure/
  groups.yml
  users.yml
  ...

Here's an example for the YAML syntax you would use:

variant: fcos
version: 1.4.0
passwd:
  groups:
    !include your-custom-structure/groups.yml
  users:
    - !merge your-custom-structure/users.yml
    - name: core
      groups:
        - wheel

Output:

variant: fcos
version: 1.4.0
passwd:
  groups:
    - name: test
  users:
    - name: core
      groups:
        - wheel
    - name: user1
      ...

carlocorradini commented 2 years ago

> For those who want to use including and merging right away, I have created a Makefile that uses yq to implement the first part of my suggestion: [...]

We really need the merging feature... 😅

Thanks @lukasbestle for sharing your work!

LorbusChris commented 2 years ago

Here's another example of how to (naively) merge Butane config snippets: https://github.com/LorbusChris/butane-config-template

carlocorradini commented 2 years ago

> Here's another example of how to (naively) merge Butane config snippets: https://github.com/LorbusChris/butane-config-template

So useful! Thanks!

bgilbert commented 2 years ago

@dghubble Hmm, I'm not sure I understand the API issue. On the Ignition side, there are currently a couple ways to merge two byte slices containing Ignition configs:

- Encode each config into a data URL and put it into an ignition.config.merge.source in a newly-created parent Ignition config. This is late-binding, in that the actual merging will be performed by Ignition at provisioning time. It also incurs some size overhead, which might be relevant when userdata size is limited.
- Parse each Ignition config into the spec version of your choice using ParseCompatibleVersion, and then use the Merge function for that version. This is early-binding, which is generally nicer, but limits the caller to supporting only spec versions that were stable when the caller's code was last updated.

Both approaches support fragments with mixed spec versions. We've used both in different places, depending on the situation; e.g. the first one can be implemented in non-Go code. The first one is probably too obscure to add as a helper function in Ignition, and the second one doesn't seem worth a helper because it's basically two function calls. Maybe I'm missing something though?
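
The first (late-binding) approach needs no Go code, so it can be sketched in a few lines of Python. The function name `wrap_children` and the default spec version 3.3.0 are assumptions made for illustration; the children are taken to be already-rendered Ignition configs as bytes.

```python
import base64
import json


def wrap_children(child_configs, spec_version="3.3.0"):
    """Build a parent Ignition config that merges each child via a data URL.

    child_configs: list of bytes, each an already-rendered Ignition config.
    spec_version: assumed Ignition spec version for the wrapper config.
    """
    merge = []
    for raw in child_configs:
        encoded = base64.b64encode(raw).decode("ascii")
        merge.append({"source": "data:;base64," + encoded})
    return {"ignition": {"version": spec_version, "config": {"merge": merge}}}


example_child = b'{"ignition": {"version": "3.3.0"}}'
print(json.dumps(wrap_children([example_child]), indent=2))
```

The base64 encoding is where the size overhead mentioned above comes from: each wrapped child grows by roughly a third.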

On the Butane side, I think programmatic merging will happen for free as part of the user-facing implementation. The caller will be able to create a Butane config struct with the appropriate merge directives (which may mean we should support both inline and local) and run it through ToIgn* in the usual way. If that turns out to be too finicky, we can always create a helper function. But I'd want to make sure we design primarily for the user experience, rather than starting with the external API.

In any event, let's keep this issue focused on user-facing config merging. If you'd like to continue discussion of the API side, feel free to open a separate issue.

bgilbert commented 2 years ago

@lukasbestle Thanks for the writeup and proposal!

Import/include and merge via YAML tags

I have substantial reservations about this approach.

- Supporting arbitrary inclusions requires a preprocessing stage that'd be more complex to implement.
- Included fragments wouldn't be full Butane configs that could be parsed, just snippets of YAML text.
- Similarly, the parent config couldn't be parsed as a valid Butane config in its own right, without preprocessing it first. As a result, we'd either need to forgo representing inclusions in Butane config structs (which might be programmatically generated), or every field that supported inclusions would no longer be literal, and the caller would need to think about escaping etc.
- As far as I can tell (correct me if I'm wrong!) this doesn't add any new capabilities over the merge approach. Merge semantics generally allow a child config to edit the fields of a parent struct. (Config merging isn't perfect; for example, items can't be removed from lists. But I don't think inclusion would allow that either.)
- We try to keep the general "feel" of the Butane UX similar to Ignition. Having two separate inclusion designs with different semantics doesn't feel great.

Native Ignition merge

I think it makes more sense to extend the existing merge semantics to Butane configs. But adding a --recursive flag has a similar problem as listing merge sources on the command line: the rendered config would behave differently depending on external parameters to the Butane compiler. (For example, if Matchbox found a Butane config on disk, it wouldn't know whether to pass the flag.) Instead, I think we should add parallel merge/replace fields for Butane child configs. For example:

variant: fcos
version: 9.9.9-experimental
ignition:
  config:
    merge:
      - local_butane: child.bu
      - inline_butane: |
          variant: fcos
          version: 1.1.0
          [...]

I'd think users would mostly use local_butane. inline_butane could be useful for programmatically-generated config structs.

As you point out, source_butane has security implications and would need to disable local file references in the child. I'm not excited about adding additional sandboxing requirements, and wonder whether we can just omit that field for now.

I agree that in all cases, child configs should use the same --files-dir as the parent.

Another option:

variant: fcos
version: 9.9.9-experimental
ignition:
  config:
    merge:
      - butane: true
        local: child.bu
      - butane: true
        inline: |
          variant: fcos
          version: 1.1.0
          [...]

Early or late binding

To be completely consistent with Ignition config merging, we'd need to recursively render child configs to Ignition and include them as data URLs ("late binding"). That allows Ignition itself to handle the actual merging at runtime, and allows child configs to use a newer config spec than the parent. It's also space-inefficient and awkward, and the output is difficult to manually inspect. (I ended up building this for debugging coreos-installer iso customize, which sometimes wraps Ignition configs in Ignition configs in Ignition configs.)

If we want Butane itself to handle the merging ("early binding"), Butane would recursively evaluate the children, evaluate the parent without the merge directives, and then merge the two together. Child configs couldn't have a newer spec version than the parent, since the parent would presumably define the output Ignition config version, and the newer spec might contain fields that are unrepresentable in the older one. We'd also need to think about validation semantics: if a child fcos config includes fields that would be forbidden if they were in the parent openshift config, should we fail?
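
The version constraint for early binding can be sketched as a simple check. This is an illustration only: the function names are hypothetical, it assumes parent and children use the same variant so their version numbers are comparable, and it ignores any suffix such as `-experimental`.

```python
def parse_version(version):
    """Parse 'X.Y.Z' or 'X.Y.Z-suffix' into a comparable tuple (suffix ignored)."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def check_child_versions(parent_version, child_versions):
    """Raise if any child uses a newer spec version than the parent."""
    parent = parse_version(parent_version)
    too_new = [v for v in child_versions if parse_version(v) > parent]
    if too_new:
        raise ValueError(
            f"child versions {too_new} are newer than parent {parent_version}"
        )


# A 1.4.0 parent can absorb 1.1.0 and 1.4.0 children without error.
check_child_versions("1.4.0", ["1.1.0", "1.4.0"])
```

Tuple comparison is used rather than string comparison so that, for example, 1.10.0 is correctly treated as newer than 1.9.0.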

Arguably, early binding is a separable issue, since we might conceivably want to support it even for the current Ignition-centric semantics of ignition.config.merge. So one approach is to skip it for now, and possibly implement an ignition.config.merge.early_bind flag field in a second pass. That's also awkward, though, since we'd be delegating a relatively obscure decision to the user.

lukasbestle commented 2 years ago

Thanks for sharing your thoughts!

> Supporting arbitrary inclusions requires a preprocessing stage that'd be more complex to implement.

Fair enough. :)

> Included fragments wouldn't be full Butane configs that could be parsed, just snippets of YAML text.

> Similarly, the parent config couldn't be parsed as a valid Butane config in its own right, without preprocessing it first. As a result, we'd either need to forgo representing inclusions in Butane config structs (which might be programmatically generated), or every field that supported inclusions would no longer be literal, and the caller would need to think about escaping etc.

> As far as I can tell (correct me if I'm wrong!) this doesn't add any new capabilities over the merge approach. Merge semantics generally allow a child config to edit the fields of a parent struct. (Config merging isn't perfect; for example, items can't be removed from lists. But I don't think inclusion would allow that either.)

I'm combining these three points in my reply as they are related. I feel like merging full Butane configs and importing snippets are features for two very distinct use cases.

One major use case for importing snippets is a "configuration template" that can be shared as a Git repo. So you would have a main.bu file that contains the general structure and from there you would import snippets for specific parts that can be customized by the user of the template. In this use case, merging full configs doesn't make much sense as each snippet would need to use the full Butane document structure even though it maybe only defines a list of custom users etc.

Keep in mind that snippet importing is much less complex regarding the behavior and handling. E.g. the Butane version issue you describe for config merging wouldn't be an issue here as the preprocessor can rightfully assume that the imported snippet uses the same Butane version. I'm generally a huge fan of solutions with low complexity and a huge impact. This doesn't mean that importing can replace config merging (it really can't), I just think it would be a simple and powerful solution for this use case.

> We try to keep the general "feel" of the Butane UX similar to Ignition. Having two separate inclusion designs with different semantics doesn't feel great.

OK, that makes sense. Maybe it's a question of expectations for the Butane tool. When I read that there is a tool to generate Ignition configs, I had really thought that it included all sorts of convenience functions to make life with Ignition configs easier. But if I understand you correctly, Butane is really just meant as a simple low-level tool to convert YAML to JSON and validate the document structure. Of course it does more internally, but that's how it feels from the user perspective. To be honest, I feel like that's a bit of a lost opportunity: there is already ignition-validate, so if I really want to, I can reproduce a large part of the user experience without Butane. But well, that's a meta discussion that probably does not belong in this issue.

Child configs couldn't have a newer spec version than the parent, since the parent would presumably define the output Ignition config version, and the newer spec might contain fields that are unrepresentable in the older one.

Why is that? I'd say that the output should use the newest version of all used Butane configs. This is the only way to represent all config data. And if a newer version ever removes a feature from a previous version, there's likely a conflict anyway and Butane should fail.

Or can there be a case where a user would deliberately want to choose an older version for their parent config even though they want to merge in configs with a newer version?

We'd also need to think about validation semantics: if a child fcos config includes fields that would be forbidden if they were in the parent openshift config, should we fail?

Yes, please. If Butane doesn't fail, Ignition will and that won't help anyone.

I'd even say that Butane should always fail if it encounters configs from different Butane variants (reduces complexity and avoids bugs in weird edge cases). This restriction excludes use cases where one would want to use a global base config for multiple Butane variants. But I really wonder if this use case is even viable without edge cases.

Arguably, early binding is a separable issue, since we might conceivably want to support it even for the current Ignition-centric semantics of ignition.config.merge. So one approach is to skip it for now, and possibly implement an ignition.config.merge.early_bind flag field in a second pass. That's also awkward, though, since we'd be delegating a relatively obscure decision to the user.

I agree. It all gets incredibly complex. I'm worried it will be very hard to understand and especially to debug in the end.

ghost commented 2 years ago

Allow me to give a bigger picture for this.

Butane is (becoming) another provisioning/configuration management tool, à la Ansible, Chef or Puppet.

Unlike these tools, it lacks essential features, including the "merge" of different configuration files, which is needed for code de-duplication:

In Ansible, we use roles (backend: ansible-galaxy) in playbooks.
In Chef, we use recipes (backend: chef-server/automate) in cookbooks.
In Terraform, we use modules (backend: Terraform registry) in other modules.
In Python, we use pip packages (backend: PyPI) in Python scripts.
PHP: composer with packagist, Java: gradle with maven, etc, etc.

For now, Butane lacks this kind of “library dependency system”:

This issue but also…
…the backend to enable it - like for example, a very simple implementation: get dependency libraries from git.
Variables. Being able to merge Butane files will quickly become limiting without variables (https://github.com/coreos/butane/issues/111), not only for secrets, but also for dynamic values: server configuration moved from static to dynamic, and now from dynamic to abstracted (IPs change all the time, naming conventions evolve, servers are treated as cattle, security becomes real-time…). How can we manage abstracted machines with a static provisioning tool?
…Without workarounds. With things like manual variable injection or makefiles, it becomes very hard to break a Butane YAML file into its relevant subparts: unit files, configuration files, scripts… When everything is in the Butane YAML file, how do you lint and unit test bash scripts? How do you check configuration file syntax?
Documented good practices!

I think what is happening with Butane/Ignition and CoreOS is very exciting! I believe Butane is now at the heart of CoreOS adoption for DevOps and sysadmins, but compared to other provisioning tools, it’s still a bit dry.

Can you consider not only this feature, but plan for the need to have a proper library management system for Butane?

bgilbert commented 2 years ago

Let's keep this issue focused on config merging UX, please.

@gui-don, I've moved your comment to a separate issue #301.

bgilbert commented 2 years ago

@lukasbestle Thanks for the response!

One major use case for importing snippets is a "configuration template" that can be shared as a Git repo. So you would have a main.bu file that contains the general structure and from there you would import snippets for specific parts that can be customized by the user of the template. In this use case, merging full configs doesn't make much sense as each snippet would need to use the full Butane document structure even though it maybe only defines a list of custom users etc.

I agree with the "configuration template" use case, but disagree that merging full configs doesn't make sense here.

To make this more concrete, I've included some examples at the bottom of this comment: a unified config that creates a file and two users, the same config broken out using inclusion as I understand the proposal, and the same config broken out using Butane config merging.

In my view, the merged configs are substantially cleaner than the included ones. The intentions of the child configs are clear at a glance; the child configs declare their variant/version and thus have well-defined semantics; and the extra boilerplate is fairly trivial. Also (not shown in the example) it's possible for a single child config to declare a related user, systemd service, and file, without switching to a different config format (from inclusion to a Butane child config). And in the parent config, the child configs are all listed in one place, rather than scattered through the file.

Keep in mind that snippet importing is much less complex regarding the behavior and handling. E.g. the Butane version issue you describe for config merging wouldn't be an issue here as the preprocessor can rightfully assume that the imported snippet uses the same Butane version.

That's okay for the example you gave. However, if the feature existed, it would surely be used in contexts where the parent and child configs are managed independently. In that case, bumping the version of the parent config could change the semantics of the child — possibly even invalidating it. (Semantic changes are allowed in major spec version bumps.) There'd be no metadata in the child indicating which spec it was written for, so a human would need to manually fix up the child based on knowledge of the relevant spec versions. I don't think saving a few lines is worth the ambiguity.

Butane's predecessor ct (the Container Linux Config Transpiler) used unversioned config files, which essentially made it impossible to change the semantics of existing fields. That's why Butane is so rigorous about versioning.

Maybe it's a question of expectations for the Butane tool. When I read that there is a tool to generate Ignition configs, I had really thought that it included all sorts of convenience functions to make the life working with Ignition configs easier. But if I understand you correctly, Butane is really just meant as a simple low-level tool to convert YAML to JSON and validate the document structure.

Butane is indeed intended to be a higher-level tool, with convenience functions etc. But it's also aligned with the philosophy of the rest of the provisioning stack (Ignition/Afterburn) and to some degree the rest of FCOS, which favors small, correct, opinionated tools rather than adding every possible feature. IMO that's consistent with adding useful config merging, but in a form that favors explicitness and correctness.

Child configs couldn't have a newer spec version than the parent, since the parent would presumably define the output Ignition config version, and the newer spec might contain fields that are unrepresentable in the older one.

Why is that? I'd say that the output should use the newest version of all used Butane configs. This is the only way to represent all config data. And if a newer version ever removes a feature from a previous version, there's likely a conflict anyway and Butane should fail.

There are a couple things tied up here. On the one hand, it's convenient to allow parent and child to be versioned independently. On the other hand, we currently have a well-defined mapping from Butane spec version to Ignition spec version. That's mainly important because configs need to be able to set an upper bound on the output Ignition version, since they may be targeting OSes with older versions of Ignition.

So actually, I think it would be okay to loosen that requirement, as long as we don't spontaneously generate a config version newer than any of the merge inputs. We could emit the max version of all merged configs, or we could have the parent config version limit the versions of its children. The latter is more explicit and also somewhat harder to use.

Example unified config

variant: fcos
version: 1.4.0
storage:
  files:
    - path: /a
      contents:
        inline: hello world
passwd:
  users:
    - name: user1
    - name: user2

Example inclusion

Parent config

variant: fcos
version: 1.4.0
storage:
  files:
    - !include 'file.yml'
passwd:
  users: !include 'users.yml'

file.yml

path: /a
contents:
  inline: hello world

users.yml

- name: user1
- name: user2

Example config inclusion

Parent config

variant: fcos
version: 1.4.0
ignition:
  config:
    merge:
      - local_butane: file.bu
      - local_butane: users.bu

file.bu

variant: fcos
version: 1.4.0
storage:
  files:
    - path: /a
      contents:
        inline: hello world

users.bu

variant: fcos
version: 1.4.0
passwd:
  users:
    - name: user1
    - name: user2

lukasbestle commented 2 years ago

@bgilbert Thank you, that is entirely convincing. I agree that config merging is the best way forward also for this use case.

So actually, I think it would be okay to loosen that requirement, as long as we don't spontaneously generate a config version newer than any of the merge inputs. We could emit the max version of all merged configs, or we could have the parent config version limit the versions of its children. The latter is more explicit and also somewhat harder to use.

Both ways make sense, but considering:

That's mainly important because configs need to be able to set an upper bound on the output Ignition version, since they may be targeting OSes with older versions of Ignition.

I'd say that the safest way would be to always use the Butane version as defined in the parent config. Child configs with a lower version would be upgraded. If a child config uses a higher version than the parent config, ideally the following would happen:

If the data in the child config can be entirely represented in the lower parent Butane version, "downgrade" the child config to the parent version and then merge them.
If the child uses config that is not available in the lower parent version, throw a fatal error with the explanation that the parent config version needs to be at least X because of child config Y.

In case such a compatibility check between different Butane versions and the downgrade are not easily possible, the alternative would be to always throw a fatal error if the child config version is higher. With a good and specific error message this would still be easy to fix by the user and the compatibility check with downgrade could still be added as an enhancement later.

bgilbert commented 2 years ago

Yeah, I generally agree. If we start with the most restrictive model (fail if the child version is newer than the parent), we can always loosen it later.

Note that there's no mechanism for version downgrades (other than ign-converter, which is unsupported), and no mechanism for upgrading/downgrading Butane configs at all. Any version translation would be applied to Ignition configs after they've been transpiled. The translation code is maintained by hand, and I don't think we should add translations that are only used in certain corner cases; rules will inevitably be missed.
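As a rough illustration of the most restrictive model discussed above (fail if a child's spec version is newer than the parent's), the check could be sketched as follows. This is not Butane code: `parseVersion`, `newer`, and `checkChild` are hypothetical names, and the sketch assumes plain `major.minor.patch` versions with no pre-release suffix.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// parseVersion splits a "major.minor.patch" spec version into integers.
// Assumption: suffixes such as "-experimental" are not handled here.
func parseVersion(v string) ([3]int, error) {
	var out [3]int
	parts := strings.Split(v, ".")
	if len(parts) != 3 {
		return out, fmt.Errorf("invalid version %q", v)
	}
	for i, p := range parts {
		n, err := strconv.Atoi(p)
		if err != nil {
			return out, fmt.Errorf("invalid version %q: %w", v, err)
		}
		out[i] = n
	}
	return out, nil
}

// newer reports whether version a is later than version b.
func newer(a, b [3]int) bool {
	for i := range a {
		if a[i] != b[i] {
			return a[i] > b[i]
		}
	}
	return false
}

// checkChild (hypothetical) rejects a child config whose spec version is
// newer than the parent's, naming the child in the error message.
func checkChild(parent, child, childName string) error {
	p, err := parseVersion(parent)
	if err != nil {
		return err
	}
	c, err := parseVersion(child)
	if err != nil {
		return err
	}
	if newer(c, p) {
		return fmt.Errorf("parent config version must be at least %s because of child config %s", child, childName)
	}
	return nil
}

func main() {
	fmt.Println(checkChild("1.4.0", "1.3.0", "users.bu"))
	fmt.Println(checkChild("1.4.0", "1.5.0", "file.bu"))
}
```

Components are compared numerically rather than as strings, so that 1.10.0 sorts after 1.4.0. Loosening the rule later would only require changing `checkChild`.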

dghubble commented 2 years ago

Parse each Ignition config into the spec version of your choice using ParseCompatibleVersion, and then use the Merge function for that version.

Thanks @bgilbert, I missed this addition. It handles the case I had for merges of varying configs via the API.

bgilbert commented 2 years ago

https://github.com/coreos/butane/issues/301#issuecomment-1126639220 discusses a use case where child configs would like to have a different --files-dir (or at least a different base directory for path resolution) than the parent.

bgilbert commented 2 years ago

We could support reading a child config in either of two modes:

Read a specified Butane config, using the same files-dir as the parent.
Read a specified "module", which is a directory containing a module.bu, where the directory is also the files-dir for the child.

We could distinguish these implicitly by checking whether local_butane points to a directory, or explicitly by defining a second field e.g. local_module.

MLNW commented 10 months ago

This feature would be extremely beneficial for those of us working with infrastructure as code, where modularity, reusability, and templating are key practices. Implementing this would greatly enhance Butane's utility by allowing us to construct more modular and maintainable configurations that can be easily templated and reused across various projects and environments.

nveeser commented 2 weeks ago

So I had some free time to play with open source tools and got stuck on this issue. I read through much of this discussion from a couple of years ago.

I made a package that works for me for now, but it surfaced a number of issues that will probably be useful to think about long term.

https://github.com/nveeser/butanex

Summary (as I understand it)

Users value some form of config parameterization and/or composition to allow building an Ignition file from multiple composable "providers" (ie Terraform modules, etc). Two different paths are "import" and "template". This bug is about "import".

As with all abstractions there is a tension. The more sophisticated you make the set of tools, the easier it is to build something really complex which is very hard to reason about ("why does only host Y get storage.files.foo to X, it should be Y").

Early / Late binding

This could be implemented in Ignition (aka late binding) or in Butane before transformation (aka early binding). The latter makes Ignition configs easier to read, has better ergonomics, and possibly has other benefits.

Versioning

Butane files are versioned which allows the implementation to change semantics while allowing users to migrate intentionally. With multiple files together there is an important challenge of merging the same object from two files where the objects have different semantics.

Notes

Working through this I found a few corner cases but I suspect there are more. Especially since I have no experience with tools like Terraform and how that could be used to partition this problem. Here are some notes.

Mapping Nodes - Maps vs Structs

This operates on generic map[string]any, not the schema types. Merging is pretty straightforward with maps, but some signal is lost. I looked into parsing to the variant-specific schema.Config types; however, there is no easy mapping, as the only mapping I found was translating directly to Ignition JSON bytes.

Perhaps the translator interface could be an interface type rather than a function type?

Local Paths

I made the choice to update relative paths (ie Resource.Local) where possible. When merging a file, the local paths in that file are updated to be relative to the new file based on the existing one. This may make assumptions on the underlying filesystem layout.

Merge Versions

Given a.yaml at 1.6.0 and b.yaml at 1.7.0, what happens when a specific node has changed semantics (e.g. a scalar changed to a mapping)? Again, merging using the Go structs from a given schema package is probably better for determining this than operating on the generic map[string]any types.
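On generic maps, about the only signal available is the kind of each node. A rough sketch that reports paths where two trees disagree on node kind follows; `kind` and `Conflicts` are hypothetical names, and the keys in `main` are made up for illustration and do not come from any Butane spec.

```go
package main

import (
	"fmt"
	"sort"
)

// kind classifies a decoded YAML node as mapping, sequence, or scalar.
func kind(v any) string {
	switch v.(type) {
	case map[string]any:
		return "mapping"
	case []any:
		return "sequence"
	default:
		return "scalar"
	}
}

// Conflicts returns sorted descriptions of dotted paths present in both
// trees where the node kinds differ.
func Conflicts(a, b map[string]any, ctx string) []string {
	var out []string
	for k, bv := range b {
		av, ok := a[k]
		if !ok {
			continue
		}
		p := k
		if ctx != "" {
			p = ctx + "." + k
		}
		if kind(av) != kind(bv) {
			out = append(out, fmt.Sprintf("%s: %s vs %s", p, kind(av), kind(bv)))
			continue
		}
		if am, ok := av.(map[string]any); ok {
			// Kinds match, so bv is a mapping as well.
			out = append(out, Conflicts(am, bv.(map[string]any), p)...)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	a := map[string]any{"section": map[string]any{"field": "value"}}
	b := map[string]any{"section": map[string]any{"field": map[string]any{"value": "value"}}}
	fmt.Println(Conflicts(a, b, ""))
}
```

This catches a scalar that became a mapping, but not a field that kept its shape while changing meaning, which is where typed schema structs would help.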

Configurable semantics

The challenge with merging config trees like this is in subtle details like sequences. For example, when merging two sequences, does the source sequence append to or replace the destination sequence? Sometimes you want one file to contain defaults and the other file to overwrite the defaults. I suspect the semantics can be per file or per tag?

For tag specific behaviors one could store them on struct tag of the schema struct. For file-level behaviors this could be a command line argument or a node with tags in the Butane YAML itself? Likely some user will come asking to fine-tune that for the specific use case. Again the challenge is addressing needs without handing out foot-guns.

I added a config with patterns that can be applied to parts of the tree. If the pattern matches the context path, that behavior is applied (overwrite, replace file paths, etc.).
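To make the append-versus-replace question concrete, here is a small sketch of a recursive map merge where a rule list decides per context path whether a source sequence replaces the destination. `Rule`, `replaceAt`, and `Merge` are hypothetical names, and using `path.Match` globs over slash-separated paths is an assumption of this sketch rather than how butanex does it.

```go
package main

import (
	"fmt"
	"path"
)

// Rule (hypothetical) maps a glob over slash-separated context paths
// to a sequence behavior.
type Rule struct {
	Pattern string // e.g. "storage/files" or "storage/*"
	Replace bool   // true: source replaces destination; false: append
}

// replaceAt returns the behavior of the first matching rule.
func replaceAt(rules []Rule, ctx string) bool {
	for _, r := range rules {
		if ok, _ := path.Match(r.Pattern, ctx); ok {
			return r.Replace
		}
	}
	return false // default: append
}

// Merge merges src into dst recursively and returns dst.
func Merge(dst, src map[string]any, ctx string, rules []Rule) map[string]any {
	for k, sv := range src {
		p := path.Join(ctx, k)
		dv, exists := dst[k]
		if !exists {
			dst[k] = sv
			continue
		}
		switch s := sv.(type) {
		case map[string]any:
			if d, ok := dv.(map[string]any); ok {
				dst[k] = Merge(d, s, p, rules)
				continue
			}
		case []any:
			if d, ok := dv.([]any); ok && !replaceAt(rules, p) {
				dst[k] = append(d, s...)
				continue
			}
		}
		// Scalars, replaced sequences, and kind mismatches: source wins.
		dst[k] = sv
	}
	return dst
}

func main() {
	dst := map[string]any{"passwd": map[string]any{"users": []any{"user1"}}}
	src := map[string]any{"passwd": map[string]any{"users": []any{"user2"}}}
	fmt.Println(Merge(dst, src, "", nil))
}
```

Ignition's own merge matches list entries by key (for example, files by path); this sketch does not attempt that and simply appends or replaces whole sequences.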

YAML Sequence ambiguity

If I understand the spec, YAML is permissive about sequence types. It's possible to write a YAML key that contains only a single element; if the target is a sequence type (aka slice), then YAML (at construction?) will add that single element to the list. Looking at the file, there is no way to determine whether that field is a list field or not.

Examples

users:
  ssh_authorized_keys: "key1"

users := map[string]string{
    "ssh_authorized_keys": "key1",
}

users: 
  ssh_authorized_keys: 
    - "key1"
    - "key2"

users := map[string][]string{
    "ssh_authorized_keys": []string{
        "key1",
        "key2",
    },
}

Merging two files that use both forms for the same key is challenging to get right (I guess the sequence should always win?). I did not solve this issue here.
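One possible way around the ambiguity, if the merger knows from the schema that a key is a list field, is to normalize both spellings to a sequence before merging. The sketch below assumes that knowledge is available; `toSequence` and `mergeListField` are hypothetical names.

```go
package main

import "fmt"

// toSequence returns v unchanged if it is already a sequence and wraps
// any other value in a one-element sequence.
func toSequence(v any) []any {
	if seq, ok := v.([]any); ok {
		return seq
	}
	return []any{v}
}

// mergeListField merges two values for a key the caller knows to be a
// list field, regardless of how each file spelled it.
func mergeListField(dst, src any) []any {
	out := append([]any{}, toSequence(dst)...)
	return append(out, toSequence(src)...)
}

func main() {
	fmt.Println(mergeListField("key1", []any{"key2", "key3"}))
}
```

Without schema information the merger cannot tell a scalar field from a one-element list, which is another argument for working on the typed structs.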

My guess is that schema.Config structs are the better choice long term than map[string]any for correctly merging and providing the user with clear feedback for conflicts or issues.

coreos / butane

Merging Butane configs #118

