omni-us / jsonargparse

Implement minimal boilerplate CLIs derived from type hints and parse from command line, config files and environment variables
https://jsonargparse.readthedocs.io
MIT License
314 stars 42 forks source link

Loaded config does not get saved when redirecting `print_config` to same config file #470

Closed awaelchli closed 5 months ago

awaelchli commented 5 months ago

šŸ› Bug report

If I load a config, and immediately save it again to the same file, all values reset to their defaults:

python repro.py --config out.yaml --print_config > out.yaml

It does not produce the same settings as just printing the config:

python repro.py --config out.yaml --print_config

This only happens when the file we read from is the same file we write to. While it may sound silly to load the config and immediately write it, the way I have found this issue was because I wanted to regenerate configs with comments via print_config=comments, but then saw it's changing the values.

To reproduce

from pathlib import Path
from typing import Optional, Union
from typing_extensions import Literal
from jsonargparse import CLI

def setup(
    model_name: Optional[str] = None,
    out_dir: Path = Path("out/pretrain"),
    resume: Union[bool, Path] = False,
    devices: Union[int, str] = "auto",
    tokenizer_dir: Optional[Path] = None,
    logger_name: Literal["wandb", "tensorboard", "csv"] = "tensorboard",
    seed: int = 42,
):
    pass

if __name__ == "__main__":
    CLI(setup)

Save this content to a yaml file:

model_name: name
out_dir: out/pretrain
resume: true
devices: auto
tokenizer_dir: tokenizer/dir
logger_name: csv
seed: 41

Note how it has different values than the defaults in the Python signature.

Now run

python repro.py --config out.yaml --print_config > out.yaml

You will see that the out.yaml file reverts back to default values.

Expected behavior

I expect it to be idempotent.

Environment

Thanks to @carmocca for finding out that it happens because the filename is the same šŸŽ‰

mauvilsa commented 5 months ago

There might be a bug, but not the one described. This behavior is due to how shells work. When the output of a command is redirected to a file, the redirection is executed before the command. That is > out.yaml is done first, and what this does is make out.yaml an empty file. So, --config out.yaml basically loads an empty file, which is why the defaults are printed. If you change the command to a redirection with concatenation python repro.py --config out.yaml --print_config >> out.yaml you can see that the print is not the defaults, and out.yaml would have two copies of the content due to the concatenation.

What might be a bug is the fact that providing an empty config does nothing. Maybe the command should fail in this case.

awaelchli commented 5 months ago

@mauvilsa Thanks, that's good to know. In this case I understand there is nothing jsonargparse can do. I'm fine writing out the config to a different file first and then renaming it.