yougov / velociraptor

BSD 3-Clause "New" or "Revised" License
11 stars 1 forks source link

Ingredient and inline config should be merged, not updated #193

Open ghost opened 8 years ago

ghost commented 8 years ago

Originally reported by: Jason R. Coombs (Bitbucket: jaraco, GitHub: jaraco)


The VR config model is pretty generous when it comes to sharing config amongst applications or instances of an application through the ingredients model.

When it comes to nested config, however, VR implements a very simple model where config values are simply updated with others, so it's not possible for a swarm-level config to override a single item from a group of config, such as exemplified in this doc or for different ingredients to implement components of the same config.

This limitation creates an incentive for application developers to only ever solicit config values from top-level keys (which are merged across ingredients). For example, a recent application chose to look for config in xrefs_* keys where the value of the * was an arbitrary disambiguator and the values of all xrefs_ prefixed configs were merged and processed identically.

Ideally, VR should not be influencing the design of the application in this way.

If on the other hand, VR were to deep-merge ingredients, it would allow for less redundant overrides and more natural sharing of config across ingredients.

Following that above reference to its inspiration (lodash), I found pydash, which implements the same merge routine. It deep-merges dictionaries:

>>> from pydash.objects import merge
>>> merge(dict(a=dict(b=3, c='foo')), dict(a=dict(b=-1)))
{'a': {'b': -1, 'c': 'foo'}}

And merges lists:

>>> d1 = dict(xrefs=[{'/panels/': '...'}])
>>> d2 = dict(xrefs=[{'/questionnaires/': '...'}])
>>> merge(d1, d2)
{'xrefs': [{'/questionnaires/': '...', '/panels/': '...'}]}

I propose that VR should simply use this merge function when compiling config. In this proposal, I'm not suggesting to change the current expectation about priority of config (or the arbitrary ordering of config from selected ingredients).

@btubbs, @lbolla: Any concerns or objections?


ghost commented 8 years ago

Original comment by Brent Tubbs (Bitbucket: btubbs, GitHub: btubbs):


Ideally the application author isn't aware of config overrides/merging at all. They just get a dict.

It sounds like you've stepped away from that in order to manage/compose config a certain way. I think that's fine. The workaround described in the OP seems like a lesser evil than the proposed fix.

ghost commented 8 years ago

Original comment by Jason R. Coombs (Bitbucket: jaraco, GitHub: jaraco):


Do you have a suggestion then for solving the issue described in the OP, where the design of the config is being influenced by VR, and the subtleties of VR config merging must be considered when writing that code? Not only is the developer forced to consider this in the design, but then she has to communicate to the user deploying the app how to chose names when providing this config in order to effectively disambiguate from current and future config elements.

ghost commented 8 years ago

Original comment by Brent Tubbs (Bitbucket: btubbs, GitHub: btubbs):


I don't want the developer in my example to need to keep the subtleties of VR config merging in mind when writing that code. She should be able to treat it as a dumb dict.

I dislike this feature for much the same reasons I disliked the priority-stacked config. I highly value having the system be easily understandable by new developers and ops people with lots of other things on their minds. I don't think the increased power is worth the increased complexity in reasoning about the config and the increased risk of deploying config that you didn't intend.

ghost commented 8 years ago

Original comment by Jason R. Coombs (Bitbucket: jaraco, GitHub: jaraco):


@btubbs: So you're saying the ability to remove keys from a config in the swarm config, overriding ingredients that define those keys, is an important feature? In your example, it would be possible for the swarm config to contain

bank_backend:
  production_mode: false

or

bank_backend:
  production_mode: null

to disable production mode, which would be more explicit than simply omitting it. The code would to have been written like this for the key presence to be the only signal:

if 'production_mode' in settings.bank_backend:
  actually_transfer_money()

In that case, it would not be possible in a swarm config to override an ingredient to remove production_mode if a config ingredient defines it. I can see how that might occur, but it seems a fairly contrived case and bad design besides (especially for a cost-sensitive function). My guess is you would be hard-pressed to find a single example in production that relies on the absence of a key to signal behavior and that even in such a case, some value for that key is equivalent to the absence of that key.

IMO, the power afforded by allowing for deep merging (using a well-defined, standard implementation) outweighs the disadvantages of not being able to clear keys, a feature which would indeed be lost in the implementation of this backward-incompatible change.

ghost commented 8 years ago

Original comment by Jason R. Coombs (Bitbucket: jaraco, GitHub: jaraco):


I created vr.server pr 193 for consideration.

ghost commented 8 years ago

Original comment by Brent Tubbs (Bitbucket: btubbs, GitHub: btubbs):


-1. Sometimes apps may rely on the presence/absence of a particular config key, and this merging could have harmful unforeseen consequences in that case.

Imagine that your application has code like this in it somewhere:

if settings.bank_backend.get('production_mode'):
    actually_transfer_money()