Idea: allow function mappers

omrihar commented 5 years ago

Hi, first of all, thanks for this cool project :)

I'm converting the config file from https://github.com/tiangolo/full-stack-fastapi-postgresql/ to use configurator and I'm trying to tidy it up by dividing into different sections. There are a lot of environment variables loaded in that config, and they mostly follow a pattern, so I was wondering whether it would be helpful to allow for parsing the environment variables as they are loaded. So, for example, run something like this:

def mapper(s: str) -> str:
    header = dict(SMTP='email', POSTGRES='db', ...)
    prefix = s.split("_")[0]
    suffix = "_".join(s.split("_")[1:]
    if prefix in header:
        return f"{header[prefix]}.{suffix.lower()}"
    return s.lower()

config.merge(os.environ, mapper)

This is, of course, just a rough idea rather than a fully functioning example. Optionally if None is returned perhaps the value can be excluded from the merge.

Currently I'm adding all the environment variables I need explicitly to the mapper, but I'm not sure it's ideal.

Cheers, Omri

cjw296 commented 5 years ago

I really like this idea! I know a few others things in this area, including Starlette's config stuff that allow for environment variables to be picked out based on a prefix.

I think the callable should take the whole source and target though, as it'd allow for more flexibility.

The tricky one, though, is how to support type coercion for environment variables in a useful way?

omrihar commented 5 years ago

hmmm... good point. Maybe allow the mapper to return a tuple with type information, if relevant? This way all the logic will be handed over to the mapper.

cjw296 commented 5 years ago

The tricky bit from what I can see, is how would you tell the mapper function that certain environment variables are of a particular type?

I'm also wondering if the mapper function should be responsible for returning a mapping dictionary, rather than actually doing any manipulation?

omrihar commented 5 years ago

I thought the mapper function will run once per environment variable, and depending on the logic implemented in it, will just return the name of the field to add rather than also the value or do any manipulation. This way it can implement the logic of the project. So, if in your project every environment variable ending with _PORT should be int, you simply add that. And if you have explicit typing in your envvar names, you can implement it.

I would envision something like (forgive the very crude implementation):

def mapper(x):
    if x.startswith('POSTGRES'):
      header = 'database'
   if x.endswith("PORT"):
      typehint=int
   fieldname = f"{header}.{x.split('POSTGRES_')[-1].lower()}"
   if typehint:
      return (fieldname, typehint)
   return fieldname

So if the mapper returns a tuple, it dictate type conversion. If it returns a string, it's a field name. If it returns None, skip that and don't add it to the config.

Does this makes sense to you?

cjw296 commented 5 years ago

It does, but remember that both sources and targets are arbitrarily nested in configurator, and I'm not sure I can think of a sensible way to feed that piece by piece to matter function.

How about this:

from configurator import source, target, convert
PREFIX = 'POSTGRES_'

def mapper(source, target):
    mapping = {}
    for key in source:
        if key.startswith(prefix):
            from_ = source[key]
            if key.endswith('PORT'):
                from_ = convert(from, int)
            mapping[from_] = target['database'][key[len(PREFIX):].lower()]
    return mapping

Although, at this point I'm also wondering whether this should just be a function?

def from_env(node, prefix):
    for key, value in os.environ.items():
        if key.startswith(prefix):
            node.data[key[len(PREFIX):].lower()] = value
...
from_env(config.database, 'POSTGRES_')

...with some finesse for type conversion of certain keys?

omrihar commented 5 years ago

That's a tough one. I was only thinking about this feature as a helper for environment variables, which are by definition not nested. Would you consider adding this functionality only for flat sources (such as envvars)? Or would you prefer to keep it more general? Perhaps it could be embedded in a more general method that also hides the part about needing to use os.environ.

So instead of using merge, you could do something like


def my_fancy_mapper(x):
    ....
    return # what we said before

config.merge_envvars(mapper=my_fancy_mapper)

This way you could even support calling it without a mapper, for people who just want to have all of their environment variables attached to config, perhaps replacing defaults, or calling stuff like

config.merge_envvars(mapper=str.lower)

cjw296 commented 5 years ago

I think I'm leaning towards a specialist helper function for environment variables, how about configurator letting you do something like this?


from configurator import Config, merge_env_into

config = Config(...)
merge_env_into(config.databases, prefix='POSTGRES_', types={'_PORT': int})

cjw296 commented 5 years ago

Alternatively, how about this:


from configurator import Config, EnvMerger

config = Config(...)
config.merge(os.environ, mergers=EnvMerger(prefix='POSTGRES_', types={'_PORT': int}))

cjw296 commented 5 years ago

One more idea:

from configurator import Config, config_from_env

defaults = Config(...)
env = config_from_env(prefix='POSTGRES_', types={'_PORT': int})

config = defaults + env

omrihar commented 5 years ago

I like your last one, but how would it handle multiple prefixes? So in my example I actually have these envvars (I'm using the fullstack fastapi cookiecutter from https://github.com/tiangolo/full-stack-fastapi-postgresql/):

POSTGRES_DATABASE
POSTGRES_USER
POSTGRES_PASSWORD
....
SMTP_HOST
SMTP_PORT
SMTP_TTL
SMTP_USER
...
PROJECT_NAME, ...

Which I would like to ideally keep in different levels. How about:

from configurator import Config, config_from_env

defaults = Config(...)
prefixes = {
    'POSTGRES_': 'database',
    'SMTP_': 'emails',
    'default': 'app'
}
env = config_from_env(prefixes=prefixes, types={'_PORT': int})

config = defaults + env

cjw296 commented 5 years ago

I don't know that the 'default' thing will work, that will put the whole env into the config. How do you tell which environment variables you want, versus which are unrelated to your app?

omrihar commented 5 years ago

Maybe the 'default' thing can be ... instead? And it will just mean anything you didn't add already, should be added under app. I agree it's a bit coarse. Perhaps explicit is better than implicit. So just define any envvar you want. On the other hand, I would sometimes want to have access to all envvars, and don't mind if some irrelevant ones are also there.

How would you handle name conversion, though? Would you lower case everything? Or add another option to transform the field names?

cjw296 commented 5 years ago

Can't use ..., python 2 support, like it or not, is important for a number of use cases. I don't like that pattern from FastAPI either, ... is not very obvious.

"lowercase everything" seems like a sensible default, but an optional parameter to pass in your own transformation callable seems like an easy one to do.

omrihar commented 5 years ago

yes, I meant if you go the config_from_env route, you'll need a way to specify field name conversion. I mean if there is no callable merger function.

cjw296 commented 5 years ago

Okay, I've got a PyCon UK talk on FastAPI I need to work on for the next week or two, but let me know if this ends up blocking you and I'll try and get it sorted!

omrihar commented 5 years ago

Thanks @cjw296, it's not blocking me at all (I've done this manually). It's just a feature idea I came up with while switching to configurator (so I can finally use the same testing procedure you use for your diary app, as we discussed a long time ago).

I'll attend PyCon and PyData Berlin in October, was wondering if anyone would be there / present FastAPI ;) Too bad I'm not going to the UK one...

cjw296 commented 4 years ago

Apologies for the delay, would you mind taking a look at https://github.com/Simplistix/configurator/pull/5 and see if it meets your needs?

cjw296 commented 4 years ago

Hmm, no feedback, so I'm going to go with the PR I mentioned. Just need to add docs and then release.

simplistix / configurator

Idea: allow function mappers #4