Open luator opened 10 months ago
I am all for this! The kwargs make it so difficult to track all the different config settings. Bonus points if we can integrate with the command line interface and print help strings for the different settings.
On the implementation side, personally I would not be against switching away from smart settings, but this definitely needs a wider discussion. If it is based on omegaconf, imports probably could be handled with a custom resolver. There are also some config features like evals (implemented either in smart-settings or in cluster_utils) which would need to be supported in the new system.
It would probably also break backwards compatibility.
By Maximilian Seitzer on 2023-11-27T09:19:09 (imported from GitLab)
Jonas suggested to use Hydra. I never used it but it's also based on omegaconf and people with ML background may already be familiar with it, so might be worth to take a look at.
By Felix Widmaier on 2023-11-28T11:55:58 (imported from GitLab)
I strongly oppose Hydra. I think it would be a very bad idea to base cluster_utils on it.
(if needed, I can also write a more substantiated explanation of why I think that. The most basic version is that Hydra is not targeted at the type of application cluster_utils is.)
By Maximilian Seitzer on 2023-11-28T12:00:26 (imported from GitLab)
I did a small proof of concept to confirm that the include feature can be implemented via resolvers in omegaconf:
def _include_yaml(file):
return OmegaConf.load(file)
def main():
OmegaConf.register_new_resolver("include_yaml", _include_yaml)
conf = OmegaConf.load("main_cfg.yml")
OmegaConf.resolve(conf)
print(OmegaConf.to_yaml(conf))
Then you can use foo: "${include_yaml:extra_values.yml}"
in the config file.
This only works for yaml as is, but could easily be extended to support different formats by checking the file extension.
Of course it would break compatibility with old configs, as the syntax for the includes changes.
By Felix Widmaier on 2024-01-08T12:05:22 (imported from GitLab)
One more UX improvement in this direction: it would be nice to reject unknown configuration values. Right now, when I mistype an optional parameter, I am not warned and my code silently executes differently than intended.
This should be fully solved by parsing the configuration into a structured format.
Currently there is no explicit definition of the structure of the configuration. The config file is just read to a dictionary and parts of it are even passed around as kwargs (like
foo(**params["foo_settings"])
). While this is quite flexible, it also makes it almost impossible to understand the complete set of parameters that can be configured.Therefore I propose to refactor the code to load the settings into a clearly defined data structure (e.g. a dataclass or namedtuple). Advantages are:
If the request is accepted, it still needs to be decided how to implement it. One option could be to move from smart_settings to variconf (disclaimer: I wrote it), but I'm not sure if omegaconf has an import feature, so maybe that wouldn't work without loosing functionality. In the easiest case, we could stick with smart_settings and transfer everything to a NamedTuple-structure after loading.