nerfstudio-project / nerfstudio

A collaboration friendly studio for NeRFs
https://docs.nerf.studio
Apache License 2.0
9.45k stars 1.29k forks source link

Load config from yaml file and allow to overwrite arguments in the train.py script #1281

Open pschroeppel opened 1 year ago

pschroeppel commented 1 year ago

Hi, with the scripts/train.py script, it is currently possible to load a config from a yaml file via the --load-config path/to/config.yaml argument. However, when this is done, all other command line arguments are ignored.

For example calling:

python scripts/train.py vanilla-nerf --load-config=/path/to/config.yaml --vis=viewer blender-data

would use a tensorboard visualization, if this is what was indicated in the config.yaml. Instead of this, in my opinion it would be nice to use the config.yaml as defaults, but overwrite specific values based on given command line arguments.

I'm not experienced with tyro, so I find it difficult to implement this. Is there an easy way to achieve what I want? best, Philipp

brentyi commented 1 year ago

Hi!

This is a good idea and sounds doable. One option is to:

  1. Load the contents of the YAML into a config object before any of the training script'styro logic.
  2. Pass a default=the_loaded_config_object argument into the tyro.cli() call here: https://github.com/nerfstudio-project/nerfstudio/blob/828d6bc893683042b3763247b65b3adeb5d5ea7e/scripts/train.py#L248-L251 This will tell tyro to read default values from the loaded object.

The question then becomes: how do we get the path to the YAML config before tyro.cli() is called? Two options I'm thinking are:

  1. We could remove load_config from the TrainerConfig object entirely, and instead use an environment variable. This would avoid the chicken/egg problem where load_config is in the config, but the config should be instantiated based on the contents of load_config.

    Usage would then look something like:

    NS_LOAD_CONFIG=../some/path/to/config.yml python scripts/train.py vanilla-nerf --vis=viewer blender-data
  2. A little more user-friendly: we could preemptively parse --load-config from sys.argv before any of the other fields. argparse + parse_known_args() might be useful for this.

Does this make sense?

pschroeppel commented 1 year ago

Hi, I tried now to implement option 2 (preemptively parse --load-config from sys.argv). For this I changed the entrypoint function in scripts/train.py as follows:

def entrypoint():
    """Entrypoint for use with pyproject scripts."""
    # Choose a base configuration and override values.
    config_file_parser = argparse.ArgumentParser("Parser for loading config from file")
    config_file_parser.add_argument('--load-config', type=Path, dest='config_path')
    args, _ = config_file_parser.parse_known_args()

    if args.config_path:
        CONSOLE.log(f"Loading pre-set config from: {args.config_path}")
        loaded_config = yaml.load(args.config_path.read_text(), Loader=yaml.Loader)
    else:
        loaded_config = None

    tyro.extras.set_accent_color("bright_yellow")
    main(
        tyro.cli(
            AnnotatedBaseConfigUnion,
            description=convert_markup_to_ansi(__doc__),
            default=loaded_config,
        )
    )

However, tyro doesn't accept the loaded_config and fails. It first gives warnings:

/misc/lmbraid18/schroepp/virtualenvs/miniconda3/envs/nerfstudio-rtx/lib/python3.8/site-packages/tyro/_fields.py:385: UserWarning: Could not find field __tyro_dummy_field__ in default instance TrainerConfig

/misc/lmbraid18/schroepp/virtualenvs/miniconda3/envs/nerfstudio-rtx/lib/python3.8/site-packages/tyro/_fields.py:677: UserWarning: Mutable type <class 'nerfstudio.engine.trainer.TrainerConfig'> is used as a default value for `__tyro_dummy_field__`. This is dangerous! Consider using `dataclasses.field(default_factory=...)` or marking <class 'nerfstudio.engine.trainer.TrainerConfig'> as frozen.

and then fails with an error message starting with:

Exception has occurred: AssertionError
`` was provided a default value of type <class 'nerfstudio.engine.trainer.TrainerConfig'> but no matching subcommand was found. A type may be missing in the Union type declaration for ``, which is currently set to typing.Union[typing_extensions.Annotated[nerfstudio.engine.trainer.TrainerConfig, _SubcommandConfiguration(
...

I'm not experienced with tyro and have difficulties understanding the problem. Do you know what is going on? best, Philipp

brentyi commented 1 year ago

Hi Philipp — thanks for taking the time to try this.

I experimented a bit with your implementation, and it seems like the sophisticated subcommands in nerfstudio make this problem harder than I initially imagined[^1]. This should be solvable, but to get something we'd be happy with I think I'll need to sit down with it for at least a few hours. A different solution or some upstream updates to tyro may be needed.

Added to my to-do list and I'll try to get to this this coming week, in the meantime if overriding a YAML is urgent it seems like a user can just copy/edit the YAML?

[^1]: To elaborate: when we pass in a default for a type with subcommands, tyro needs to figure out which subcommand the default corresponds to (is this config yaml from ns-train nerfacto? ns-train depth-nerfacto? ns-train instant-ngp?). For the vast majority of cases this matching is straightforward, but nerfstudio's subcommands happen to not fall into this category.

pschroeppel commented 1 year ago

Hi, alright, thanks a lot for looking into this! And it's not urgent at all, as it's easy enough to circumvent the problem. It's more a matter of convenience :) best, Philipp

brentyi commented 1 year ago

Brief update: with https://github.com/brentyi/tyro/pull/33 merged and bundled in the latest release of tyro we are on track for getting this issue resolved. Wrapping up is on my to-do list but not sure when I'll get to it. 🥲

markdjwilliams commented 2 weeks ago

If possible, could I kindly ask for an update on what work remains to complete the fix? I’m curious about the current status and if there are any plans for further progress.