facebookresearch / hydra

Hydra is a framework for elegantly configuring complex applications
https://hydra.cc
MIT License

[Feature Request] Override loaded config file through a command line flag #386

Closed tamuhey closed 4 years ago

tamuhey commented 4 years ago

🚀 Feature Request

I want to pass yaml file instead of parameters like foo.bar=1 ... to override default configuration:

$ python app.py --yaml foo.yaml

omry commented 4 years ago

Can this file be inside the search path (in the same directory as the config file you specify in @hydra.main() )?

If so, try:

conf/
  config.yaml
  experiment/
      exp1.yaml

and from the command line:

$ python foo.py experiment=exp1

This will override your config file with exp1.

tamuhey commented 4 years ago

Can this file be inside the search path (in the same directory as the config file you specify in @hydra.main() )?

No. I'm creating a CLI with Hydra, and this use case is for the end user, so the config file cannot be placed inside the conf directory. The range of possible modifications is too large to define in advance.

tamuhey commented 4 years ago

The reason I use Hydra is that the configuration of my CLI is somewhat complex, and Hydra suits it well. The app is closed source for now, but will be opened before long. (I'll let you know once it is.)

EtienneDavid commented 4 years ago

Yes, it would definitely be helpful to be able to change the config_name from the CLI. Hydra sounds super helpful for simplifying the architecture of some of my modules, but the user should be able to redefine the config from scratch...

omry commented 4 years ago

This is more involved than it sounds. config_path currently describes both the config search path, relative to the Python file with hydra.main, and optionally a config file to load from the search path. There are a few related issues open for allowing some control over the search path from the command line, as well as this one. When loading a config file using some command line override, what does it do (if anything) to the search path?

This is also a user-facing interface, which means it's not very forgiving of repeated changes. I want to get this right the first time.

There are other higher priority issues I am working on. I will get to it.

omry commented 4 years ago

This is a clean workaround that addresses most of the use cases:

In conf, create a directory called experiment:

conf\
  experiment\
    exp1.yaml
    exp2.yaml

Each one can specialize the generated config, for example exp1.yaml can be:

exp1.yaml:

learning_rate: 0.1
batch_size: 32

Now from the command line: $ python foo.py experiment=exp1

There are some use cases where an external config file can be useful, but this should cover a large number of the use cases.

jbohnslav commented 4 years ago

Hi there,

I just wanted to +1 for this feature request. My use case is as follows. I have a complex configuration file with lots of details the user will not need to interact with, e.g. optimizers. There are elements, particularly regarding the data and augmentations, that all users will have to customize.

Default conf/config.yaml:

train:
  lr: 0.0001
  optimizer: adam
dataset:
  name: imagenet
  classes:
    - tench
    - goldfish
    - great white shark
    - ... etc
augs:
  resize: 224
  flip_lr: true
  flip_ud: false

user_config.yaml:

dataset:
  name: not hotdog
  classes:
    - hotdog
    - not hotdog
augs:
  resize: 64
  flip_lr: true
  flip_ud: true

What I want is to be able to use the following syntax and have Hydra load the default configuration file first, and optionally override with the user's supplied config file: python train.py --config user_config.yaml

Ideally, the user could also supply command line arguments which would take precedence over all. I'm not sure if this is too unique to my use case. I want the priority to be command line args > user defaults from user_config.yaml > conf/config.yaml.
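The precedence described above (command line > user file > defaults) can be sketched in plain Python. This is only an illustration of the merge order, not how Hydra or OmegaConf implement it; `deep_merge` and the `cli_overrides` dict are hypothetical names introduced for the example, and the configs are abbreviated versions of the ones shown earlier.

```python
from copy import deepcopy
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where values from `override` win over `base`.

    Nested dicts are merged key by key; any other value (including a list)
    is replaced wholesale by the override.
    """
    merged = deepcopy(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = deepcopy(value)
    return merged


# Abbreviated stand-ins for conf/config.yaml and user_config.yaml.
defaults = {
    "train": {"lr": 0.0001, "optimizer": "adam"},
    "dataset": {"name": "imagenet", "classes": ["tench", "goldfish"]},
    "augs": {"resize": 224, "flip_lr": True, "flip_ud": False},
}
user_config = {
    "dataset": {"name": "not hotdog", "classes": ["hotdog", "not hotdog"]},
    "augs": {"resize": 64, "flip_lr": True, "flip_ud": True},
}
# Hypothetical command-line override, e.g. train.lr=0.01.
cli_overrides = {"train": {"lr": 0.01}}

# Lowest priority first: defaults, then the user file, then the command line.
final = deep_merge(deep_merge(defaults, user_config), cli_overrides)
print(final)
```

Keys the user file does not mention (here `train.optimizer`) keep their default values, which is the behaviour I am after.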

Is there a way to do this currently with hydra?

omry commented 4 years ago

Please reach out on the chat.

SunQpark commented 4 years ago

I'm currently using a workaround for a similar case.

Default conf/config.yaml:

user_config: # path to config file to update.
train:
  lr: 0.0001
  optimizer: adam
dataset:
  name: imagenet
  classes: 
    - tench
    ...

Then, in train.py, you can override the Hydra config object with OmegaConf.merge when user_config is given. Values from the later argument (the user's file) take precedence:

if config.user_config is not None:
    user_config = OmegaConf.load(config.user_config)
    config = OmegaConf.merge(config, user_config)

Now, you can use python train.py user_config=user_config.yaml

gunthergl commented 4 years ago

This comment follows on from @jbohnslav's. I think @tamuhey's question may be related but does not have to be, because here I assume the user config can live inside the search path.

├── conf
│   ├── config_default.yaml
│   ├── cfgs
│   │   ├── base.yaml
│   │   └── userconfig.yaml
└── my_app.py

config_default.yaml:

defaults:
  - cfgs: base
  - cfgs: userconfig

base.yaml:

db:
  first_param: 1

userconfig.yaml:

db:
  userdefined_param: 2

my_app.py

import hydra
from omegaconf import DictConfig

@hydra.main(config_path="conf/config_default.yaml")
def my_app(cfg: DictConfig) -> None:
    print(cfg.pretty())

if __name__ == "__main__":
    my_app()

And the output is:

db:
  first_param: 1
  userdefined_param: 2

For multirun, I added userconfig2.yaml inside cfgs:

db:
  anotherParam: 2

Then start via: python my_app.py cfgs=userconfig,userconfig2 -m

[2020-05-16 12:04:38,165][HYDRA] Sweep output dir : multirun/2020-05-16/12-04-38
[2020-05-16 12:04:38,166][HYDRA] Launching 2 jobs locally
[2020-05-16 12:04:38,166][HYDRA] #0 : cfgs=userconfig
db:
  first_param: 1
  userdefined_param: 2

[2020-05-16 12:04:38,246][HYDRA] #1 : cfgs=userconfig2
db:
  anotherParam: 2
  first_param: 1

I do not know what happens here exactly but as long as it works..

gunthergl commented 4 years ago

I just found that it is sufficient to have the following main config file config_default.yaml:

defaults:
  - cfgs: base
  - cfgs: base

Then the default is base, and you can override it on the command line.

Still, I do not know if there are unwanted side effects.

omry commented 4 years ago

@gunthergl, I mentioned this approach in multiple comments above. This is also documented in the tutorial to some extent here.

moinfar commented 4 years ago

Feature Request

I want to pass yaml file instead of parameters like foo.bar=1 ... to override default configuration:

$ python app.py --yaml foo.yaml

Hi, I believe the requested feature is a must-have.

By the way, I suggest this workaround:

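import sys

import hydra
from omegaconf import DictConfig

def main(cfg: DictConfig) -> None:
    print(cfg.pretty())

if __name__ == "__main__":
    config_path = "./conf/default.yaml"

    # Treat a leading config=<file> argument as the config file to load,
    # and remove it so Hydra does not parse it as an override.
    if len(sys.argv) > 1 and sys.argv[1].startswith("config="):
        config_path = sys.argv[1].split("=")[-1]
        sys.argv.pop(1)

    main_wrapper = hydra.main(config_path, strict=True)
    main_wrapper(main)()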

This way you can specify your config file by adding config=somefile.yaml at the beginning of your command.
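The argument handling can also be pulled into a small pure function, which makes it easier to test without touching sys.argv. This is a sketch only: `split_config_arg` is a hypothetical helper name, not part of Hydra, and it splits on the first = only, so a path that itself contains = is kept whole.

```python
from typing import List, Tuple


def split_config_arg(argv: List[str], default: str) -> Tuple[str, List[str]]:
    """Pull a leading `config=<path>` token out of argv.

    Returns the chosen config path and a copy of argv without that token,
    so the remaining arguments can be passed on untouched.
    """
    if len(argv) > 1 and argv[1].startswith("config="):
        # Split only on the first "=" so paths containing "=" survive.
        path = argv[1].split("=", 1)[1]
        return path, [argv[0]] + argv[2:]
    return default, list(argv)


path, rest = split_config_arg(
    ["app.py", "config=somefile.yaml", "foo.bar=1"], "./conf/default.yaml"
)
print(path, rest)
```

The returned list can then be assigned back to sys.argv before the Hydra-decorated function is called.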

omry commented 4 years ago

This requested feature is also a will-have. Patience.

omry commented 4 years ago

This is coming, and by "this" I mean what I am ready to support at this stage: Hydra 1.0 splits config_path into config_path and config_name; you can learn about it here.

This feature is opening up the door for overriding the config name and the config path individually through a simple command line flag.

Limitations

  1. config_name is relative to the search path. This will not work for files in arbitrary locations on the file system.
  2. config_path overrides the one in the file. If you override it, you will lose access to configs in the current config_path mentioned in the Python file.

I am not planning on doing more than this for 1.0. If there are concrete use cases where this is insufficient, please open a new issue.

omry commented 3 years ago

#874 is coming in Hydra 1.0.0rc3 and will probably address most uncovered use cases.

r9y9 commented 3 years ago

Wow, --config-dir option is exactly what I wanted. It worked perfectly in my use case, where I wanted to allow hydra app users to have custom configurations outside the package path. Looking forward to v1.0.0 release 💯

omry commented 3 years ago

Hydra 1.0.0 already has rc2 released which you can try today (pip install hydra-core --pre --upgrade). You will be able to try --config-dir now by installing from master.

npuichigo commented 3 years ago

@omry I think a use case is to reload the configs in .hydra if overriding happened during previous runs. For example

config
├── dataset
│   ├── cifar10.yaml
│   └── imagenet.yaml
├── eval.yaml
├── model
│   ├── alexnet.yaml
│   └── vanilla.yaml
├── train.yaml
└── trainer
    └── default.yaml

python train.py model.num_units=256

Then I get the overridden configs in .hydra:

.hydra
├── config.yaml
├── hydra.yaml
└── overrides.yaml

# overrides.yaml
- model.num_units=256

Now, I can reproduce my training with the following command line:

python train.py --config-path=output/xxxx-xx-xx/.hydra --config-name=config

That's great.

However, how can I explicitly combine the overrides.yaml with the primary configs in config directory? For example, here I want to reload my model for inference. What I want is:

# eval.py use config-path=config and config-name=eval, which is eval.yaml
python eval.py --override-yaml=output/xxxx-xx-xx/.hydra/overrides.yaml

# it's equivalent to
python eval.py --config-path=config --config-name=eval model.num_units=256
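To make the equivalence concrete, here is a rough sketch of what such a flag would have to do: read the saved override list and append each entry to the command line. `--override-yaml` does not exist in Hydra, and `overrides_to_args` is a hypothetical helper; it assumes the flat one-override-per-line layout shown above rather than parsing YAML properly.

```python
from typing import List


def overrides_to_args(text: str) -> List[str]:
    """Turn a flat list of "- key=value" lines into command-line overrides.

    Assumes the simple layout shown above (one override per line, no nesting,
    no quoting); a real YAML parser would be needed for anything richer.
    """
    args = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.startswith("- "):
            args.append(line[2:].strip())
    return args


# Contents of output/xxxx-xx-xx/.hydra/overrides.yaml, inlined for the example.
overrides_yaml = """\
# overrides.yaml
- model.num_units=256
"""

command = ["python", "eval.py", "--config-path=config", "--config-name=eval"]
command += overrides_to_args(overrides_yaml)
print(" ".join(command))
```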

Maybe it's ad-hoc to split the config into train.yaml and eval.yaml. Here's another use case.

If I want users to save their own override configs in their workspace, those overrides need to be combined with the primary config.

egs
└── neural_network
    ├── config
    │   ├── large_network.yaml
    │   ├── medium_network.yaml
    │   └── small_network.yaml
    └── run.sh

# large_network.yaml
- model.num_classes=1024

# medium_network.yaml
- model.num_classes=512

# small_network.yaml
- model.num_classes=256

Here, the user may only want to override the num_classes of model

python train.py --override-yaml=egs/neural_network/config/large_network.yaml

# it's equivalent to
python train.py --config-path=config --config-name=train model.num_classes=1024

Of course, the workaround is to use command-line arguments directly, but I'm looking for a more elegant way.

egs
└── neural_network
    ├── run_large.sh
    ├── run_medium.sh
    └── run_small.sh

npuichigo commented 3 years ago

@omry

omry commented 3 years ago

@npuichigo, please open a separate feature request.

dmarx commented 2 years ago

Another workaround option for anyone who needs it: I'm invoking a plugin that adds the user's current working directory to hydra's search path. Depending on how you integrate this into your application, it probably won't override defaults. At least in my use case, it's still preferable to requiring the users to provide the --config-dir flag (for now).

import logging
import os

from hydra.core.config_search_path import ConfigSearchPath
from hydra.plugins.search_path_plugin import SearchPathPlugin

# See also:
# https://hydra.cc/docs/advanced/search_path/#
# https://github.com/facebookresearch/hydra/issues/763

logger = logging.getLogger(__name__)

class PyttiLocalConfigSearchPathPlugin(SearchPathPlugin):
    def manipulate_search_path(self, search_path: ConfigSearchPath) -> None:
        # Append <current working directory>/config/ to Hydra's search path.
        local_path = f"{os.getcwd()}/config/"
        logger.debug(local_path)
        search_path.append(
            provider="myframework", path=f"file://{local_path}"
        )