garden-rs / garden

Garden grows and cultivates collections of Git trees ~ Official mirror of https://gitlab.com/garden-rs/garden
https://garden-rs.gitlab.io
MIT License
66 stars 9 forks source link

includes duplicates rather than overrides #14

Closed grymoire7 closed 1 year ago

grymoire7 commented 1 year ago

When using includes in the garden configuration file, I noticed that if two files are included and each contains the same command then when that command is run, it is run twice instead of once. This can be replicated by including the same config twice from different path locations. My expectation of the behavior was 'override' rather than 'duplicate'. I was hoping to use this feature for personal overrides on a department-provided configuration.

davvid commented 1 year ago

Indeed, the current implementation of garden.includes for commands are that they are additive. The concept that commands can be extended is key to how commands in templates and gardens are processed. A garden-supplied command runs after the tree-supplied commands are run, so it's currently additive on purpose in that part of its usage (trees inside gardens).

Commands at global scope are a separate concept and probably deserve special treatment.

With the current implementation I've found myself overriding particular parts of a command by using variables that trees are able to override at tree scope.

Special-casing the top-level commands block so that commands get replaced rather than extended seems like a good way for it to work.

Here's a few options. I'm curious to hear your thoughts on whether this is the way to go.


Option (X) we can special-case the commands: block defined in the top-most garden.yaml and make it so that commands defined there replace the commands that were provided via garden.includes.

This means that we would still allow duplicate commands to be defined if the user chooses to put the same command in two different garden.includes. This is simpler in implementation but less consistent.

This option allows includes to "extend" existing commands defined by earlier includes. On the other hand, it's inconsistent because using just garden.includes would be additive while also using the top-level commands: would be an override and not-additive.


Option (Y) goes even farther. There's an argument to be made that "last one wins" is a simpler concept for commands at global scope when using garden.includes. In this option the the top-level garden.yaml would be defined to be the "last", and duplicates found via garden.includes would strictly "replace" any that came before them.

If we have the same command in a.yaml and b.yaml and we have garden.includes = [a.yaml, b.yaml] then a command defined in b.yaml would override the same command from a.yaml. If we then define the same command in the top-level garden.yaml's commands block then it would also replace the command and the commands in neither a.yaml nor b.yaml would be used. It would only use the override command defined in the top-level commands block.


If there's a use case later for keeping the additive behavior then we can always add a garden.compose-commands = true boolean to bring back the current additive behavior. I'm currently leaning towards Option (Y) and making global commands follow a "last one wins" rule. It seems consistent and easy to explain.

Another question regarding includes in general is, I wonder if it'd be good to pretend like each file has #pragma once (from C++) in them and skip garden.includes that we've already seen. The alternative is to keep the config reader simple and allow the user to shoot themselves in the foot if they choose to. I'm leaning towards having it skip/ignore duplicate includes, mostly because it would prevent circular includes from happening.

I'll probably add a todo note about avoiding circular / duplicate includes for later and focus on just the override feature in this issue, though.

Let me know if you agree that Option (Y) is the way it should work.

davvid commented 1 year ago

I have a candidate implementation for this enhancement in #15. It implements Option (Y) so that the top-level commands have "last one wins" semantics.

grymoire7 commented 1 year ago

Option (Y) looks great. I also agree that a #pragma once behavior would be a nice item for the todo list. Does this change have any effect on tree level commands that are duplicated?

davvid commented 1 year ago

Thanks. It does not currently have any effect on tree-level commands. It also doesn't handle duplicate trees or other settings.. only commands defined at the top-level scope (across multiple files) can be overridden since that was what kicked off this issue.

I think in the future I'll want to make it so that this behavior is consistent for duplicate trees, variables and garden definitions. It'd be good to make it so that all of them can be overridden.

For trees I'm thinking any override behavior will have to be a full-on override. Having it do a sparse override could be a little tricky for users since we wouldn't have any way to "remove" settings from a tree we're overriding, so having it be simple and just full-on replace the previous definition seems like a sensible behavior.

That said, maybe I need to consider how to handle a sparse override for trees. I can imagine someone wanting to override a tree to replace just its URL, for example. Maybe being able to "remove" settings is less of a concern. Heh.. I think I just convinced myself that sparse overrides are better.

We already kinda support "sparse override"-like behavior for trees through both the templates: feature and the extend: feature. It's not exactly the same, though, because we would have to define a different tree to extend the base one.

I'll plan to merge these changes in shortly and include them in the upcoming v0.7.0 release.

Is the current override behavior for top-level commands going to be useful as is, or will you also need to be able to override trees in order for it to be useful for your use cases?

Nonetheless, I'll probably only include the commands override in the next release to keep the release from having too many changes in it.

Let me know whether overrides for other aspects would be helpful and which aspects would be most impactful.

grymoire7 commented 1 year ago

I totally understand limiting the existing PR to top-level commands. For now, I've refactored to not depend on garden includes to modify or even add to previously defined behaviors. That may work better to solve the problems we have today in any case.

I think in the future I'll want to make it so that this behavior is consistent for duplicate trees, variables and garden definitions. It'd be good to make it so that all of them can be overridden.

That would be great!

I think that seeing some commands execute twice might surprise some people just as it surprised me. This could be especially problematic if some of those commands are not idempotent. I can see the appeal of commands being "additive"/"extended" but I would personally prefer the default to be "override" as it seems safer and (for some at least) less surprising.

At least in theory, it would be interesting to have extend_before and extend_after directives available which would determine if the additional code would be run before or after the base command. Not sure yet how useful that would be in practice though.

commands:  # overrides
  ...
extend_before_commands:
  ...
extend_after_commands:
  ...

:shrug:

Fwiw, the use case I'm thinking about is having a department-provided config (company.yaml) which the user would include from their top-level garden config. Since this file would be subject to change and replacement, it would be easiest if the user did not edit it. When/if our users really start using this regularly, I can see them potentially wanting to "tweak" things for various reasons. That's when the includes behavior becomes important, I think. Fortunately, I think that problem is at least a little bit down the road for us.

davvid commented 1 year ago

Per the recent commits, I'm currently whittling away at making it so that everything overrides. There's a few details I'll be sorting out.

We might not be able to sparsely override subsets of a tree (the "just replace the URL" use case) in the initial version, but I think that's what I'll basically be working towards. It seems like it should be more or less straightforward to implement.

davvid commented 1 year ago

The current version in git (v0.7.0 beta2) now has consistent behavior across the board for overrides. A garden file can now override any value defined by an include file.

If we provide a complete tree definition then it will replace an earlier definition. I haven't added anything related to sparsely overriding so that'll probably come in a later release.

I'll probably tag and release v0.7.0 this weekend or early next week with the current feature set before diving into aspects related to sparse overrides. Hope that helps, and thanks again for the feedback.

davvid commented 1 year ago

I think we can call this feature truly done now. Adding support for sparse Tree overrides was really straightforward to add on top of the override behavior. It ended up being really simple (Cf. 82e6d1b4fef8c1).

This behavior is now documented and we can pretty much consider it set in stone:

https://davvid.github.io/garden/configuration.html#the-last-one-wins-rule

$ garden -V
garden 0.7.0-beta3

beta3 is looking pretty much release-ready so this is more or less what will be in v0.7.0 final.