rbgirshick / yacs

YACS -- Yet Another Configuration System
Apache License 2.0

What is the advantage of using yacs instead of traditional yaml files and yaml parsing? #56

Open minimatest opened 2 years ago

minimatest commented 2 years ago

I know. A stupid and naive question. But I am a beginner and I am struggling to find a major use case for my workflow so maybe if the advantages can be spelled out, that would be even better!

Also, does YACS support assigning Python objects and variables to configuration parameters (unlike YAML)? For example, if one wanted to specify a callback function as a parameter, is it possible to do this with YACS? Again, of course, you can't specify this in YAML, and the way around it is to make a hacky getattr call.

jveitchmichaelis commented 1 year ago

Here are my thoughts on this, as a user (tldr - YACS is simple and lightweight, but there are a few options out there). First, what's a "good" configuration system?

YACS is quite simple - it's < 500 lines of code, but it handles a lot of tedious edge cases that you'd otherwise have to implement yourself. It's also well tested - a lot of people have used Detectron. I'm curious to hear what other people are using, because there don't seem to be many configuration management systems for Python. There is Hydra, which is a lot heavier but can also handle more complex things like running jobs. Hydra is built on OmegaConf. There's also anyconfig and dynaconf. All of these do mostly the same things and are built to solve the same problems. There are also libraries like schema, which you could use to validate some structure and apply defaults.

If you just need to load a few options from a flat file, then there isn't any issue using the 'ini' format, or plain YAML. But at some point you can outgrow those. You'd need to implement some kind of dictionary wrapper if you want attribute access (cfg.foo) instead of string keys (cfg["foo"]). I think this is much cleaner to look at, but that's personal preference. I think configparser allows you to set defaults, so you do have some basic inheritance there.
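To make the "dictionary wrapper" point concrete, here is a minimal sketch of what you'd end up writing yourself. The class name `AttrDict` and the example keys are made up for illustration; this is not how YACS implements it, just the general idea.

```python
class AttrDict(dict):
    """Dictionary whose keys can also be read and written as attributes."""

    def __init__(self, mapping=None):
        super().__init__()
        for key, value in (mapping or {}).items():
            self[key] = value

    def __setitem__(self, key, value):
        # Wrap nested plain dicts so that cfg.a.b works at any depth.
        if isinstance(value, dict) and not isinstance(value, AttrDict):
            value = AttrDict(value)
        super().__setitem__(key, value)

    def __getattr__(self, name):
        # Only called when normal attribute lookup fails.
        try:
            return self[name]
        except KeyError:
            raise AttributeError(name) from None

    def __setattr__(self, name, value):
        self[name] = value


# Hypothetical config keys, purely for demonstration.
cfg = AttrDict({"model": {"depth": 50}, "lr": 0.1})
cfg.model.depth = 101  # attribute access instead of cfg["model"]["depth"]
```

Even this small version has corners to think about (key names that clash with dict methods, copying, pickling), which is part of the "tedious edge cases" argument.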

Let's take the example of inheriting a config and updating a single value. You need to write the logic to sanely update a nested dictionary. The dict union operator | is not sufficient, because it only merges the top level. If you have {a: 1, b: {c: 2, d: 3}} and you unite with {b: {c: 4}}, Python will just clobber all of b and b.d no longer exists. You could write a recursive merge function yourself, but YACS provides that and it also handles multiple levels of inheritance through different files. Other libraries also do some fancy interpolation stuff, e.g. referencing variables in the config and more "dynamic" parameters.
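The clobbering problem is easy to reproduce. The sketch below uses plain dictionaries and a hand-written `deep_merge` helper (a hypothetical name, not a YACS function) to show the difference between a shallow union and a recursive merge.

```python
import copy


def deep_merge(base, override):
    """Return a new dict with `override` merged recursively into `base`."""
    merged = copy.deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            # Both sides are dicts: descend instead of replacing.
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = copy.deepcopy(value)
    return merged


base = {"a": 1, "b": {"c": 2, "d": 3}}
override = {"b": {"c": 4}}

# Shallow merge; on Python 3.9+ this is the same as `base | override`.
shallow = {**base, **override}
print(shallow)  # {'a': 1, 'b': {'c': 4}}  -> b.d is gone

merged = deep_merge(base, override)
print(merged)   # {'a': 1, 'b': {'c': 4, 'd': 3}}
```

This version silently accepts any key in `override`; a real config system usually wants to reject unknown keys as well.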

YACS provides a bunch of convenience functions to allow merging from a dictionary, from a yaml file on disk and others. The readme has some information on the philosophy here, which you may or may not agree with (e.g. command line arguments are handled in a somewhat non-standard way if you're used to using argparse).

There isn't much in the way of validation; you could look at a library like schema for that. However, YACS should complain if you provide a key which isn't in the default config. This is nice because it catches typos in user-supplied configs. It also lets you flag configuration options as deprecated or renamed, decide whether merged configurations are allowed, and so on. Finally, it also does some type checking and coercion, e.g. it complains if you try to update an int with a string.
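As a rough illustration of those two checks (unknown keys and type mismatches), here is a plain-Python sketch. The function `strict_update` and the config keys are hypothetical, and the type rule here is deliberately simple; YACS's own rules may differ, so treat this as the idea rather than its behaviour.

```python
def strict_update(defaults, updates, path=""):
    """Merge `updates` into `defaults` in place.

    Raises KeyError for keys missing from `defaults` and TypeError when a
    value's type differs from the default's type. Note: values merged before
    an error is raised stay merged; a real implementation might validate first.
    """
    for key, value in updates.items():
        full_key = f"{path}.{key}" if path else key
        if key not in defaults:
            raise KeyError(f"Non-existent config key: {full_key}")
        current = defaults[key]
        if isinstance(current, dict) and isinstance(value, dict):
            strict_update(current, value, full_key)
        elif type(current) is not type(value):
            raise TypeError(
                f"{full_key}: expected {type(current).__name__}, "
                f"got {type(value).__name__}"
            )
        else:
            defaults[key] = value


# Hypothetical defaults, purely for demonstration.
defaults = {"SOLVER": {"LR": 0.1, "STEPS": 100}}
strict_update(defaults, {"SOLVER": {"LR": 0.01}})  # fine

# These would raise:
#   strict_update(defaults, {"SOLVER": {"LEARNING_RATE": 0.01}})  # KeyError (typo)
#   strict_update(defaults, {"SOLVER": {"STEPS": "100"}})         # TypeError
```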

YACS also allows you to enforce immutability so that once you've started your experiment/app and handled user input, everything is frozen. This is very useful for repeatable experiments because you can log/store the config and you should be able to trust that the parameters weren't modified later on.
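The freezing idea can be sketched in a few lines. The class `FreezableConfig` below is a made-up, flat (non-nested) stand-in to show the pattern, not YACS's implementation.

```python
class FreezableConfig:
    """Flat config object that rejects attribute writes after freeze()."""

    def __init__(self, **values):
        # Bypass our own __setattr__ while setting up the internal flag.
        object.__setattr__(self, "_frozen", False)
        for name, value in values.items():
            setattr(self, name, value)

    def freeze(self):
        object.__setattr__(self, "_frozen", True)

    def __setattr__(self, name, value):
        if self._frozen:
            raise AttributeError(f"Config is frozen; cannot set {name!r}")
        object.__setattr__(self, name, value)


# Hypothetical parameter name, purely for demonstration.
cfg = FreezableConfig(lr=0.1)
cfg.lr = 0.01   # allowed while still mutable (e.g. applying user overrides)
cfg.freeze()
# cfg.lr = 1.0  # would now raise AttributeError
```

The usual workflow is: build defaults, merge user overrides, freeze, then log the frozen config alongside the results.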

You also get serialisation, but that's a very thin wrapper around yaml dump and you could easily customise it.

Functions-as-values is an interesting case. I'm not sure what the cleanest way to handle that is. This is a nice blog post that discusses using a registry decorator: https://julienbeaulieu.github.io/2020/03/16/building-a-flexible-configuration-system-for-deep-learning-models/
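For what it's worth, the registry idea boils down to something like the sketch below: the config stores a plain string, and a decorator maps that string to the function. The names `CALLBACKS`, `register` and `log_loss` are my own placeholders, not taken from YACS or from the blog post.

```python
CALLBACKS = {}


def register(name):
    """Decorator that records a function under `name` in CALLBACKS."""
    def decorator(fn):
        if name in CALLBACKS:
            raise ValueError(f"Callback {name!r} is already registered")
        CALLBACKS[name] = fn
        return fn
    return decorator


@register("log_loss")
def log_loss(step, loss):
    return f"step {step}: loss={loss:.3f}"


# The config only holds a string, so it stays YAML-serialisable.
config = {"callback": "log_loss"}

callback = CALLBACKS[config["callback"]]
message = callback(10, 0.25)
print(message)  # step 10: loss=0.250
```

Compared with a getattr lookup, the registry makes the set of allowed values explicit, and an unknown name fails with a clear KeyError at lookup time.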