simplistix / configurator

A Python library for handling configuration files.
https://configurator.readthedocs.io
MIT License
40 stars 6 forks source link

wrappers for lazy loading/computing configuration in relation to other configuration #7

Open RonnyPfannschmidt opened 2 years ago

RonnyPfannschmidt commented 2 years ago

i have a setup where credentials are loaded from vault into the configuration, this is done via lazy loading, so that even if there is a full configuration, only the necessary roundtrips are made

if there was a tool to have config nodes load deferred when they are used (not merged) that would be a help

cjw296 commented 2 years ago

When providing a lazy-loaded config, what would be a nice API to let you specify that?

RonnyPfannschmidt commented 2 years ago

a key here is that not the "config" is lazy loaded, but rather a specific value/subtree

# myconfig.yaml
users:
   myuser:
     username: !from_vault {path: a/b, key: username}
     username: !from_vault {path: a/b, key: password}

then i would like to have something like

class VaultResolver(ConfigResolver):
   path: str
   key: str

   def resolve(self, config: Config, path: Path):
      return config.vault_loader.value_from(self.path, self.key)
cjw296 commented 2 years ago

Are you able to share your current implementation of from_vault and your preferred way of registering it with pyyaml?

~Also, in your example, is self.config a configurator Config object?~ If so, how do you see .vault_loader coming into existence?

RonnyPfannschmidt commented 2 years ago

that as a typo, its only config

ideally a way to override the default yaml loader with one thats extended would be the mechanism of choice, as pyyaml and ruamel.yaml differ, i left it out

class VaultResolver(ConfigResolver):
   path: str
   key: str

   def resolve(self, config: Config, path: Path):
      return config.vault_loader.value_from(self.path, self.key)

class MyLoader(yamllib.Loader):
  pass

MyLoader.register(...., VaultResolver)

def vault_loader_for_settings(config):
  ...

def get_settings(given_vault_loader = None)
   config = config.from_file(..., parser=MyLoader.load) #insert correct call here
   config.vault_loader = given_vault_loader or vault_loader_for_settings()
   return config
RonnyPfannschmidt commented 2 years ago

the current impl of the vault loader looks like

class VaultSecretFetcher:
    @classmethod
    def from_settings(cls, settings):
        mountpoint = settings.get("REDACTED_VAULT_MOUNT_POINT") or "secret"
        if settings.get("REDACTED_VAULT_LOADER_ENABLED"):
            client = _create_vault_client(settings)
            _login_and_renew_token(client, settings)
        else:
            client = None
        return cls(mountpoint=mountpoint, client=client, settings=settings)

    loaded_secrets: Dict[str, Dict[str, object]]

    def __init__(self, mountpoint: str, client: Client, settings: Settings):
        self._client = client
        self._mount_point = mountpoint
        self._settings = settings
        self.loaded_secrets = {}

    def _get_path_secret_from_vault(self, path):
        if self._client is None:
            raise InvalidPath(
                f"Unable to load path '{self._mount_point}/{path}' when vault client is disabled"
            )
        if not self._client.is_authenticated():
            _login_and_renew_token(self._client, self._settings)
        try:
            data = self._client.secrets.kv.read_secret_version(path, mount_point=self._mount_point)
        except InvalidPath:
            # Give more details in the InvalidPath error message
            raise InvalidPath(f"Unable to load path '{self._mount_point}/{path}'")
        else:
            return data.get("data", {}).get("data", {})

    def _get_path_secret(self, path):
        if path not in self.loaded_secrets:
            self.loaded_secrets[path] = self._get_path_secret_from_vault(path)
        return self.loaded_secrets[path]

    def get_value_from_vault(self, path, key):
        data = self._get_path_secret(path)
        if key not in data:
            raise VaultError(f"key '{key}' not found at path '{self._mount_point}/{path}'")
        log.debug("loaded vault secret from %s/%s", self._mount_point, path)
        return data[key]
RonnyPfannschmidt commented 2 years ago

@cjw296 i think it may be a good idea to have the resolving of lazy objects be part of a wrapper

the patterns that can be implemented are

  1. resolve all lazy values in a part of the configuration as either new dictionary, or a new config object
  2. have a wrapper that looks like config but resolves lazy objects dynamically (resolver is responsible for caching)
  3. have a wrapper that looks like config and replaces dynamic references on first use with the resolved value
cjw296 commented 2 years ago

I've been playing around with this a fair bit, observations so far:

Thinking through your points:

  1. I reckon Config is going to end up with some kind of visitor helper or pattern, this will cover this case as well as helping with #10.
  2. If the loading of lazy objects happens as described in my second point above, I reckon we can get this: The lazy object / resolver gets asked for the value each time, and it gets to decide if it wants to cache. I was excited when I thought we could use the descriptor __get__ for this, unfortunately almost none of the ways .data is obtained from a container use this. 🤦
  3. This feels like it could end up brittle and also breaks my contract in the first bullet above: .data and everything it contains is "your" stuff and configurator shouldn't get involved with it.
cjw296 commented 2 years ago

Oh, and yes, plugging in parsers in a more flexible way will need to be part of implementing this. Here's my current sketch test case:

    def test_lazy_load(self, dir):
        yaml = pytest.importorskip("yaml")

        class LazyVault:
            def __init__(self, key):
                self.key = key

            @classmethod
            def from_yaml(cls, loader, node):
                return cls(loader.construct_mapping(node)['key'])

            def load_config(self):
                assert self.key == 'someuser'
                return {'name': 'Some User', 'password': '...'}

        class Loader(yaml.Loader):
            pass

        Loader.add_constructor('!from_vault', LazyVault.from_yaml)

        path = dir.write('lazy.yml', '''
        users:
        - !from_vault {key: someuser}
        ''')
        config = Config.from_path(path, parser=lambda p: yaml.load(p, Loader))
        compare(config.users[0].data, expected={'name': 'Some User', 'password': '...'})

        # make sure clone still works:
        config.clone()
        compare(config.users.data[0], expected=LazyVault(key='someuser'))
RonnyPfannschmidt commented 2 years ago

A key reason why I wanted to suggest a wrapper is that it would allow item/attributes access to be handled transparent while allowing the wrapper to be the location of both resolvers and caches separated from the other objects

So the config itself would be raw data, and a resolving config would handle the rest

I'll type up a example once im back to the computer

cjw296 commented 2 years ago

Yeah, I tried moving towards having ConfigNode instances used much more and it started getting messy quickly. So, here's another idea, which I think will work well for both #9 and #7: https://github.com/simplistix/configurator/pull/12

I'm planning on adding DataMapping and DataSequence, but I think DataValue is actually the most invasive one so wanted to start there.

Thoughts?

RonnyPfannschmidt commented 2 years ago

I'm currently under the impression that sequences and maps can be easily extended with a shadow sequence that tracks data origins +yaml/toml/json loaders that provide them

My impression is, that this would be painful for values as they would require a pretty magical wrapper

A key reason why I think collections should track is, that it would play nicely with integration of merges

cjw296 commented 2 years ago

I think we're agreeing with each other? Thankfully, looks like the wrapper for values doesn't actually end up being that magical :-) In fact, I think the necessary changes are now in #12 !