tr11 / python-configuration

A Python library to load configuration parameters
https://tr11.github.io/python-configuration/
MIT License
84 stars 24 forks source link

Inconsistent configs loading #63

Open velykanov opened 3 years ago

velykanov commented 3 years ago

Hello everyone!

I've found an interesting behaviour. Let's say you have the following config file:

[APP_NAME]
foo = bar

[ANOTHER_APP]
spam = egg

and also env var: APP_NAME_EASTER=egg

Let's also say that you want to load all configs for your APP_NAME application. You'd do something like this:

configs = config('config.ini', 'env', prefix='APP_NAME', separator='_')
assert configs == {'APP_NAME': {'foo': 'bar'}, 'EASTER': 'egg', 'ANOTHER_APP': {'spam': 'egg'}}

As you may see here env var is not a part of APP_NAME config, it's something separate That happens because of this thing https://github.com/tr11/python-configuration/blob/master/config/__init__.py#L188 - env var namespace is simply erased

I think it's loss of information and even if we want to clear env var name from prefix then this prefix still must be there on higher level, e.g. {self._prefix: result} With this approach EASTER would be a part of APP_NAME config

tr11 commented 2 years ago

That's interesting, but I'd say this is working as intended since the INI files config doesn't use prefixes. The env prefix essentially tells the env parser to read only the EASTER part.

It think the result you intended would be obtained by the same ini file, the env variable PREFIX__APP_NAME__EASTER=egg, and the configs

configs = config(f.name, "env", prefix="PREFIX", separator="__")

This way the PREFIX__ portion is ignored, and the env variable is read as APP_NAME.EASTER, resulting in it getting merged to the APP_NAME dict.

velykanov commented 2 years ago

I guess my question is more about why prefix is being thrown away I don't believe this is right behaviour

I agree that INI files don't use prefixes, but they use sections and in case section is named as env var prefix (or vice versa) it's somewhat logical that they all relate to the same thing, isn't it?

tr11 commented 2 years ago

Ignoring the prefix for ENV variables was an initial design decision. The reason for it was that we have to filter the environment (by using a prefix) so we don't read the whole environment, and in general we couldn't guarantee that there would be no clashes between the services. For example, this allowed us to put two services together in the same box without having to change env variables within the services. We had a DATABASE env variable that was distinct between the two services and this allowed us to define two env variables, SERVICE1DATABASE and SERVICE2DATABASE without having to change the internals

velykanov commented 2 years ago

Ok, let's say that sounds logical (but I still think there's an information loss. Despite using configs['DATABASE'] inside both SERVICE1 and/or SERVICE2 you still need to pass that service name as a prefix, right? So there will be some "service-aware" part of the code)

Going back to my example of having INI file where each section represents application/service configuration and config function allows you to put several configs sources, which is a great mechanism to override configurations based on their priority (order). Current implementation doesn't allow me to do that which doesn't make any sense. So it would work in case my INI file contains configs without sections, but that wouldn't allow me to keep single INI file per different services.

Moreover, let's do the same with SERVICE1 and SERVICE2, and let's assume that env variables have higher priority than config INI file (so we would read configs like configs = config('config.ini', 'env', prefix='SERVICE1', separator='__')):

config.ini

[SERVICE1]
DATABASE = example.com/db1

[SERVICE2]
DATABASE = example.com/db2

env:

export SERVICE1__DATABASE=example.com/env_db1

Instead of calling function once for SERVICE1 and enjoy DATABASE value to be example.com/env_db1 I must perform extra actions like unpacking dicts or calling function twice (and again in both cases I'd need to organise my code in specific order to provide prioritization, which config function already has inside - logic duplication, isn't it?)

tr11 commented 2 years ago

Would your use case be covered by passing the prefix to the section_prefix parameter on the INI files?

velykanov commented 2 years ago

Partially yes

There are 2 problems remain:

  1. There's no possibility to pass kwargs for different configuration loaders through config function (yet?)
  2. As we would load data from INI file by a single section only that wouldn't allow us to have other sections within the same call (if I understand things correctly)

Let's say you have an application which spins other ones on demand (let's say you're spinning python environments per users requests and java environments) + the logic with env vars remains Each of those envs should have it's own config, i.e.:

[PYTHON_ENV]
...

[JAVA_ENV]
...

And our main application should perform single configs read at the beginning and use those to spin up envs With what you're suggesting that would look like:

configs = {
    'python_env': config('config.ini', 'env', prefix='MAIN_APP', separator='__', section_prefix='PYTHON_ENV'),
    'java_env': config('config.ini', 'env', prefix='MAIN_APP', separator='__', section_prefix='JAVA_ENV'),
}

but of course, that would mixin env vars into configs

tr11 commented 2 years ago

I was thinking that the prefix from the config function would be passed to the section_prefix from the INI parser. I'm not sure we'll be able to mix and match the two in the config function cleanly, though.

I'm not sure 1. will ever happen as it may do a bit too much and never cover all use cases. For more complicated use cases, we'd be able to call the constructor of the ConfigurationSet directly to specify different prefix/parameter behaviors. Agreed with 2., we'd have to call the configs multiple times to filter with distinct prefixes.