facebookresearch / hydra

Hydra is a framework for elegantly configuring complex applications
https://hydra.cc
MIT License
8.82k stars 635 forks

[Feature Request] Is there a way to generate custom log dir name while running? #2615

Closed zhuoqun-chen closed 1 year ago

zhuoqun-chen commented 1 year ago

🚀 Feature Request

Hi, Hydra is an amazing tool that helps me a lot in managing my experiments. I'd like to know whether I can name a log directory using a variable that is automatically generated from within a running Python script.

Motivation

For example, I'd like to have an output directory structure like this:

I need this because I post-process the log directories locally. If I know, for example, that today's Exp-3 and Exp-17 produced good results, then I only process the data in those folders and skip the rest, which either aborted because of an error or contain terrible raw data:

raw_data_3 = process("2023-03-17/Exp-3")
raw_data_17 = process("2023-03-17/Exp-17")

Pitch

I don't have a specific solution because I'm not familiar with how Hydra works.

Additional context

I've browsed some posts and docs, and I know the top-level config.yaml allows a missing value such as name: ???, so I can set:

name: ???

current_time: ${now:%Y-%m-%d}_${now:%H-%M-%S}

hydra:
  # sets the output path for all file logs to `experiments/<name>/<current_time>`
  run:
    dir: experiments/${name}/${current_time}

But I'm not sure whether name can be read directly from a running script. This might solve my problem, but I would need some examples showing how to do it.

Many thanks to the developers!

omry commented 1 year ago

Hydra does not keep a running index of your jobs, so it can't do what you are asking. Maybe it's possible to do something with callbacks. I imagine that you could maintain a file with the latest running experiment number and update it in your config to be used via interpolation. Implementing this is out of scope for Hydra, but maybe the folks at the hydra-callbacks repo would be interested in your idea.
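A rough sketch of the counter-file idea, not something Hydra provides: a small helper keeps the latest experiment number in a text file, and an OmegaConf custom resolver exposes it to the config. The names `next_experiment_number`, `next_exp`, and the counter file path are made up for illustration, and the helper is not safe against two jobs starting at the same moment.

```python
from pathlib import Path


def next_experiment_number(counter_file: str = "experiments/last_exp.txt") -> int:
    """Return the next experiment number and persist it to counter_file.

    Hypothetical helper: the file holds the number of the latest experiment.
    There is no locking, so concurrent launches could get the same number.
    """
    path = Path(counter_file)
    path.parent.mkdir(parents=True, exist_ok=True)
    # A missing file means no experiment has run yet.
    last = int(path.read_text().strip()) if path.exists() else 0
    current = last + 1
    path.write_text(str(current))
    return current


def register_resolver() -> None:
    """Expose the counter to configs as ${next_exp:} (hypothetical resolver name).

    The import is local so the helper above works without OmegaConf installed.
    use_cache=True asks OmegaConf to reuse the first result, so repeated
    resolution within one process should not bump the counter again.
    """
    from omegaconf import OmegaConf

    OmegaConf.register_new_resolver("next_exp", next_experiment_number, use_cache=True)
```

Assuming `register_resolver()` is called at module level, before the `@hydra.main` function is invoked, the run directory could then be written as something like `experiments/${now:%Y-%m-%d}/Exp-${next_exp:}`. Whether the number stays stable everywhere Hydra resolves that path is worth checking on your own setup.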

As a side note, for multiruns, Hydra does maintain a running job number (hydra.job.num), documented here. But this starts at 0 for every multirun you execute.