fidelity / spock

spock is a framework that helps manage complex parameter configurations during research and development of Python applications
https://fidelity.github.io/spock/
Apache License 2.0

Interpolation and Accessing environment variables and other resolvers inside the config.yaml similar to OmegaConf #243

Closed svenstehle closed 1 year ago

svenstehle commented 2 years ago

Is your feature request related to a problem? Please describe.

Cannot access e.g. environment variables in the config (and can I access other config variables, compose from them, and append to them?). The environment variable case would be the oc.env resolver.

Maybe a separate feature request but Spock already offers some (or all, haven't checked?) of the oc.decode functionality.

Describe the solution you'd like

save_path: ${oc.env:PWD}/runs

and

some_value: path_to_a_base_dir
composed_value: ${some_value}/some_other_dir
composed_value_building_on_that: ${composed_value}/another_sub_dir

More examples found in OmegaConf.

Describe alternatives you've considered

Nick mentioned this might be done with the underlying library attrs (?)

Additional context

Started discussing the idea in this issue

Thanks for considering working on this :) As I already mentioned in the other thread/issue: happy to help, but I have no idea where to begin or how to move in a worthwhile direction without guidance. Would love to work on this together though.

ncilfone commented 2 years ago

Just thinking out loud here which might help you grok some of the internals of spock so you can contribute if you're up for it...

Basics

Most of the config values get set in this file here

These field handlers reconcile values set in the class definition with those specified in the config files. They build a dictionary and pass it to the instance constructor to instantiate the actual object (this is where attrs handles type checking, nested types, etc. via the validators functionality). There are different 'types' of resolvers because some fundamental 'types' need to be handled differently (e.g. a list or tuple is very different from an int from a Python type perspective) to make sure nesting, class references, etc. work correctly.

These field handlers are called by iterating through the dependency graph of classes here. Because there can be references to other spock classes, using a directed graph makes sure parent-child relationships are respected in the correct order.
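As a rough illustration of that iteration order (not spock's actual graph class), the sketch below uses the standard library's graphlib to order a few classes so that every referenced class comes before the class that refers to it. The class names and the `build_order` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Hypothetical example: each key is a class name, each value is the set of
# classes it references (which therefore have to be handled first).
class_deps = {
    "TrainConfig": {"ModelConfig", "DataConfig"},
    "ModelConfig": {"OptimizerConfig"},
    "DataConfig": set(),
    "OptimizerConfig": set(),
}


def build_order(deps):
    """Return class names so that every dependency precedes its dependent."""
    # static_order() raises graphlib.CycleError if the references are circular.
    return list(TopologicalSorter(deps).static_order())


order = build_order(class_deps)
```

With this ordering, `OptimizerConfig` is always handled before `ModelConfig`, and both before `TrainConfig`.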

So here are my thoughts for the requested features (fyi: we should probably break these into at least 2 features/PRs since I think they will be pretty independent of each other)

Environment Resolver (Easier to implement)

Within the field handlers we simply need to regex for the syntax we devise (probably something similar to OmegaConf but shorter), such as $env:PWD. Given spock is more strongly typed than OmegaConf, we should just bake in the 'decode' functionality: after getting the env variable, attempt to cast it to the specified type (the parameter will have a typing annotation that we can look up) in a try/except.

All of this should be doable within the field handlers (I think at the base level, i.e. not type specific), right before the assembled dictionary gets passed to the class object instantiation.
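A minimal sketch of that idea, assuming the $env:NAME syntax floated above (the final syntax was not settled at this point). The `resolve_env` function and `ENV_PATTERN` are hypothetical names for illustration, not part of spock.

```python
import os
import re

# Assumed syntax from the discussion above: "$env:NAME"
ENV_PATTERN = re.compile(r"\$env:([A-Za-z_][A-Za-z0-9_]*)")


def resolve_env(value, target_type):
    """Replace $env:NAME references in a string, then cast to target_type."""
    if not isinstance(value, str):
        # Non-string values cannot contain a reference; pass them through.
        return value

    def _lookup(match):
        name = match.group(1)
        try:
            return os.environ[name]
        except KeyError:
            raise ValueError(f"Environment variable {name!r} is not set") from None

    resolved = ENV_PATTERN.sub(_lookup, value)
    # The 'decode' step: attempt the cast to the annotated type.
    try:
        return target_type(resolved)
    except (TypeError, ValueError) as exc:
        raise ValueError(
            f"Cannot cast {resolved!r} to {target_type.__name__}"
        ) from exc
```

For example, `resolve_env("$env:PWD/runs", str)` would return the working directory with `/runs` appended. A plain cast is not enough for every type: `bool("false")` is `True` in Python, so booleans would need explicit handling.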

Variable Resolver (Harder to implement)

Called 'config node interpolation' in OmegaConf...

First thought here is that we will need to build another dependency graph between value references (fyi there is already a graph class to handle this here) to make sure we can instantiate objects in the correct order (again via a directed graph). Then when we discover a reference in the dictionary before object instantiation we can just go 'look-up' the requested value in the already built object (the Spockspace).

Syntactically this is similar to the above: we would regex for the syntax we want to use (probably just like OmegaConf), such as ${spock_class.my_value}, to find the values we need to look up in other class objects. We would need to support full class paths (as above, where the spock class name is prepended in dot notation, even for nested classes etc.) as well as references to a value within the same class, as in your example above (e.g. ${my_value}, where the missing class prefix implies a reference to a parameter within the same class).

Again I think this should be able to be handled right before class instantiation in the field handler. The graph would have to be built before then though, so we can have the correct iteration order. This is where it will be a bit tricky because we need to reconcile the order of the class dependencies and the value dependencies into one order. Not too hard but just need to think it through...
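To make the ordering problem concrete, here is a rough sketch that works on a plain dict of dicts rather than real spock classes. It assumes the ${class.param} / ${param} syntax discussed above, and `resolve_references` is a hypothetical helper, not spock's implementation. It builds a value-level dependency graph, walks it in topological order, and substitutes values that are already resolved.

```python
import re
from graphlib import TopologicalSorter

# Assumed syntax: ${ClassName.param} or ${param} (same class)
REF_PATTERN = re.compile(r"\$\{([A-Za-z_][\w.]*)\}")


def resolve_references(config):
    """Resolve references in a {class_name: {param: value}} dictionary."""

    def full_key(class_name, ref):
        # A bare ${param} refers to a parameter within the same class.
        return ref if "." in ref else f"{class_name}.{ref}"

    # Flatten to "class.param" -> raw value and record value dependencies.
    flat, deps = {}, {}
    for class_name, params in config.items():
        for param, value in params.items():
            key = f"{class_name}.{param}"
            flat[key] = value
            refs = REF_PATTERN.findall(value) if isinstance(value, str) else []
            deps[key] = {full_key(class_name, r) for r in refs}

    resolved = {}
    # static_order() yields referenced values before the values that use
    # them and raises graphlib.CycleError on circular references.
    for key in TopologicalSorter(deps).static_order():
        if key not in flat:
            raise KeyError(f"Unknown reference: {key}")
        value = flat[key]
        if isinstance(value, str):
            class_name = key.rsplit(".", 1)[0]
            value = REF_PATTERN.sub(
                lambda m: str(resolved[full_key(class_name, m.group(1))]),
                value,
            )
        resolved[key] = value
    return resolved
```

This only orders values; reconciling it with the class-level order, as noted above, is the part that still needs thought. The sketch also treats any reference containing a dot as a full path.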

Nick

svenstehle commented 2 years ago

Hi Nick,

thanks for your detailed thoughts and explanation on this. I will start on the Environment Resolver in the first PR, "easier" to implement sounds good to me :)

ncilfone commented 2 years ago

@svenstehle if you are still interested #254 should implement the Environment Resolver feature above.

Apologies that you couldn't ever get the unit tests to run for #246... You might have just needed to make sure the src dir of spock was added to your $PYTHONPATH env variable.