Open armandgurgu23 opened 2 years ago
Hi @armandgurgu23, thanks for the kind words, glad to hear Hydra helps!
The timestamped output dir can be easily configured: https://hydra.cc/docs/configure_hydra/workdir/
We will look into supporting disabling the creation altogether, but in the meantime you can easily override the output dir for your experimental runs, for example to a tmp folder:
python myapp.py hydra.run.dir=/tmp
Note that hydra.run.dir does not apply in multirun mode; hydra.sweep.dir must be used in multirun mode.
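As a rough illustration of the note above, here is a small sketch that picks the right override key depending on the mode. The helper name `output_dir_override` is hypothetical and not part of Hydra; only the keys hydra.run.dir and hydra.sweep.dir and the -m flag come from this thread.

```python
from typing import List


def output_dir_override(path: str, multirun: bool) -> List[str]:
    """Build the command-line arguments that redirect Hydra's output dir.

    Hypothetical helper for illustration only: it just formats strings,
    it does not call into Hydra.
    """
    if multirun:
        # In multirun mode the sweep directory is the one that applies.
        return ["-m", f"hydra.sweep.dir={path}"]
    # In single-run mode the run directory applies.
    return [f"hydra.run.dir={path}"]


if __name__ == "__main__":
    print(" ".join(["python", "myapp.py"] + output_dir_override("/tmp", multirun=False)))
    print(" ".join(["python", "myapp.py"] + output_dir_override("/tmp", multirun=True)))
```

The second printed command is the multirun counterpart of the single-run command shown earlier.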
Let me follow up on Jieru's comment with an example using config groups. Here is the idea:
$ python app.py save=timestamp # use timestamped output folders
$ python app.py save=tmp # use a directory called output/scratch
$ # same for multirun:
$ python app.py -m save=timestamp
$ python app.py -m save=tmp
This can be achieved with the following config files:
Here is conf/config.yaml:
defaults:
  - save: tmp
  - _self_
Here is conf/save/tmp.yaml:
# @package _global_
hydra:
  run:
    dir: output/scratch
  sweep:
    dir: multirun/scratch
Here is conf/save/timestamp.yaml:
# empty
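Since YAML is indentation-sensitive, it may help to see the three files written out with explicit nesting. The script below is only a convenience sketch for scaffolding the layout described above. The function name `write_config_tree` is made up for this example, and the file contents are copied from this comment.

```python
import tempfile
import textwrap
from pathlib import Path
from typing import List

# File contents as described in this thread, with explicit indentation.
CONFIG_FILES = {
    "conf/config.yaml": textwrap.dedent("""\
        defaults:
          - save: tmp
          - _self_
        """),
    "conf/save/tmp.yaml": textwrap.dedent("""\
        # @package _global_
        hydra:
          run:
            dir: output/scratch
          sweep:
            dir: multirun/scratch
        """),
    "conf/save/timestamp.yaml": "# empty\n",
}


def write_config_tree(root: Path) -> List[Path]:
    """Write the example config files under root and return their paths."""
    written = []
    for relative_path, text in CONFIG_FILES.items():
        target = root / relative_path
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(text, encoding="utf-8")
        written.append(target)
    return written


if __name__ == "__main__":
    # Write into a throwaway directory so nothing in the project is touched.
    with tempfile.TemporaryDirectory() as tmp:
        for path in write_config_tree(Path(tmp)):
            print(path.relative_to(tmp).as_posix())
```

To use it in a real project, call `write_config_tree(Path("."))` next to app.py.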
If you repeatedly use the save=tmp setting, the contents of output/scratch from previous app runs will be overwritten.
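To make the overwrite behaviour concrete, here is a minimal sketch of how the two settings map to directories. This is not Hydra's implementation. The function `resolve_run_dir` is hypothetical, and the timestamp pattern assumes Hydra's default of outputs/<date>/<time>.

```python
from datetime import datetime
from pathlib import Path


def resolve_run_dir(save: str, now: datetime) -> Path:
    """Mimic which output directory a single run would use.

    Illustration only: "tmp" maps to the fixed scratch directory from
    conf/save/tmp.yaml, anything else to a timestamped directory
    (assumed default pattern: outputs/%Y-%m-%d/%H-%M-%S).
    """
    if save == "tmp":
        return Path("output/scratch")
    return Path("outputs") / now.strftime("%Y-%m-%d") / now.strftime("%H-%M-%S")


if __name__ == "__main__":
    first = datetime(2024, 1, 2, 3, 4, 5)
    second = datetime(2024, 1, 2, 3, 5, 5)
    # Same directory for both runs, so the second run overwrites the first.
    print(resolve_run_dir("tmp", first) == resolve_run_dir("tmp", second))  # True
    # Distinct directories, so every run leaves a new folder behind.
    print(resolve_run_dir("timestamp", first).as_posix())   # outputs/2024-01-02/03-04-05
    print(resolve_run_dir("timestamp", second).as_posix())  # outputs/2024-01-02/03-05-05
```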
Given that the file conf/save/timestamp.yaml is empty, the following are equivalent:
$ python app.py save=timestamp
$ python app.py save=null
Given that the setting save: tmp appears in the defaults list of the primary config file, the following are equivalent:
$ python app.py
$ python app.py save=tmp
This is to say that the scratch directory output/scratch is used by default. This could be changed by instead using the setting save: timestamp in the defaults list of the primary config.
> we will look into support disabling the creation all together
Sounds good. I think there will be some interaction with the new hydra.job.chdir setting: if chdir=True then we need to call os.chdir(output_dir), so in that case we can't disable the creation of the output directory.
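The constraint comes from os.chdir itself: it fails if the target directory does not exist. A small stdlib-only sketch of that behaviour follows. The helper `enter_output_dir` is hypothetical and is not how Hydra is implemented.

```python
import os
import tempfile
from pathlib import Path


def enter_output_dir(output_dir: Path, create: bool) -> bool:
    """Try to make output_dir the current working directory.

    Returns True on success and False if the directory does not exist.
    Illustration only: Hydra's own logic is not shown here.
    """
    if create:
        output_dir.mkdir(parents=True, exist_ok=True)
    try:
        os.chdir(output_dir)
    except FileNotFoundError:
        # os.chdir cannot enter a directory that was never created.
        return False
    return True


if __name__ == "__main__":
    original_cwd = Path.cwd()
    with tempfile.TemporaryDirectory() as tmp:
        target = Path(tmp) / "outputs" / "run"
        print(enter_output_dir(target, create=False))  # False: directory is missing
        print(enter_output_dir(target, create=True))   # True: created before chdir
        os.chdir(original_cwd)  # step out before the temp dir is removed
```

So a "do not create the output directory" option would presumably only make sense when chdir is off.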
Keeping this open for now to see if there's more interest in disabling the creation of the working dir.
Feature Request
Add a CLI flag that prevents the creation of timestamped folder structures inside of outputs/ and multirun/ when executing a script decorated using @hydra.main().
Motivation
First, I would like to thank the developers of Hydra for creating such a useful and versatile tool :). I'm relatively new to Hydra and I have been adopting this tool as part of my ML experimentation lifecycle, due to its useful automatic creation of timestamped directories during code execution.
However, I have experienced a pain point with the feature described above when using Hydra for long-term experimentation (> 1 month). I have found that over time you can have a buildup of directories in outputs/ and multirun/ (the default folders where Hydra catalogues your main script executions). These directories may not contain useful configuration dumps (e.g. when you are making modifications to or troubleshooting your main experiment script, which is decorated with Hydra).
Pitch
One quality-of-life improvement would be to add the ability to conditionally disable the creation of timestamped folders inside of outputs/ and multirun/. I believe the best user experience for this feature would be to build a flag for this behaviour and be able to pass the value of this flag through the command line when executing the main script (decorated with hydra.main()).
I believe the feature request above would be valuable for experiment organization, since it would prevent creating timestamped folders during script troubleshooting and experiment development.