Open thoth291 opened 2 years ago
Hi @thoth291,
Here is a workaround that hopefully will work for you. It involves registering an OmegaConf custom resolver to calculate the set of keys represented by the syntax glob(app.*) that you have proposed:
# simple_app.py
import os
from typing import List

from hydra.core.hydra_config import HydraConfig
import hydra
from omegaconf import DictConfig, OmegaConf


def my_glob_impl(pattern: str, _root_: DictConfig) -> List[str]:
    """
    A simple glob implementation, takes a `pattern` with wildcard `*` at the
    end. The return value is a set of full keys in the config which match the
    `pattern`.
    """
    assert pattern.endswith(".*")
    pattern = pattern.removesuffix(".*")
    node = OmegaConf.select(_root_, key=pattern)
    if node is None:
        return []
    if not isinstance(node, DictConfig):
        raise NotImplementedError
    else:
        return [f"{pattern}.{key}" for key in node.keys()]


OmegaConf.register_new_resolver(name="my_glob", resolver=my_glob_impl)


@hydra.main(config_path=".", config_name="config")
def my_app(_cfg: DictConfig) -> None:
    exclude_keys = HydraConfig.get().job.config.override_dirname.exclude_keys
    print(f"{exclude_keys=}")
    print(f"Working dir: {os.getcwd()}")


if __name__ == "__main__":
    my_app()
# config.yaml
params:
  a: 4
  name: test
app:
  verbose: false
  config: false
  setup: true
hydra:
  run:
    dir: ${hydra.job.name}/single/${now:%Y-%m-%d}/${now:%H-%M-%S}
  sweep:
    dir: ${hydra.job.name} #_${hydra.job.id}
    subdir: ${hydra.job.num}_${hydra.job.override_dirname}
  job:
    name: ${params.name}
    override_dirname: ???
    config:
      # configuration for the ${hydra.job.override_dirname} runtime variable
      override_dirname:
        kv_sep: '='
        item_sep: ','
        # currently I'm using [app.verbose,app.config,app.setup]
        exclude_keys: '${my_glob: app.*}'
$ # at the command line:
$ python3 simple_app.py params.a=4,5,6 app.verbose=true -m
[2021-11-03 19:40:55,849][HYDRA] Launching 3 jobs locally
[2021-11-03 19:40:55,849][HYDRA] #0 : params.a=4 app.verbose=True
exclude_keys=['app.verbose', 'app.config', 'app.setup']
Working dir: /Users/jasha10/hydra_tmp/tmp1873/test/0_params.a=4
[2021-11-03 19:40:55,941][HYDRA] #1 : params.a=5 app.verbose=True
exclude_keys=['app.verbose', 'app.config', 'app.setup']
Working dir: /Users/jasha10/hydra_tmp/tmp1873/test/1_params.a=5
[2021-11-03 19:40:56,054][HYDRA] #2 : params.a=6 app.verbose=True
exclude_keys=['app.verbose', 'app.config', 'app.setup']
Working dir: /Users/jasha10/hydra_tmp/tmp1873/test/2_params.a=6
The basic idea is to use the my_glob_impl function to create the list of keys that you want to exclude. If you use this my_glob resolver in your Hydra config, then the _root_ argument passed to my_glob_impl will be the root config that is composed by Hydra. The glob functionality could be made more advanced by e.g. supporting wildcards in multiple places or allowing some recursive expansion (e.g. glob app.** to get all nested keys).
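As a rough illustration of the recursive expansion idea, here is a minimal sketch that works on plain nested dicts (for example, what OmegaConf.to_container returns) rather than on a DictConfig. That choice is an assumption made only to keep the example self-contained, and the names select and recursive_glob are hypothetical, not part of Hydra or OmegaConf.

```python
from typing import Any, List, Mapping


def select(root: Mapping[str, Any], dotted_key: str) -> Any:
    """Walk a nested mapping along a dotted key; return None if a part is missing."""
    node: Any = root
    for part in dotted_key.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return None
        node = node[part]
    return node


def recursive_glob(pattern: str, root: Mapping[str, Any]) -> List[str]:
    """Expand `prefix.*` to direct children and `prefix.**` to all nested leaf keys."""
    if pattern.endswith(".**"):
        prefix, recursive = pattern[: -len(".**")], True
    elif pattern.endswith(".*"):
        prefix, recursive = pattern[: -len(".*")], False
    else:
        # No wildcard: the pattern is already a full key.
        return [pattern]
    node = select(root, prefix)
    if not isinstance(node, Mapping):
        return []
    keys: List[str] = []
    for key, value in node.items():
        full_key = f"{prefix}.{key}"
        if recursive and isinstance(value, Mapping):
            # Descend into nested mappings and collect their leaf keys.
            keys.extend(recursive_glob(f"{full_key}.**", root))
        else:
            keys.append(full_key)
    return keys


# Hypothetical config with one nested level under `app`.
config = {"app": {"verbose": False, "log": {"level": "info", "file": "run.log"}}}
print(recursive_glob("app.*", config))   # ['app.verbose', 'app.log']
print(recursive_glob("app.**", config))  # ['app.verbose', 'app.log.level', 'app.log.file']
```

With `.*` a nested mapping is reported as a single key, while `.**` reports only its leaves; an empty nested mapping therefore contributes nothing in the recursive case.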
Edit: I've reversed the order of the if not isinstance(node, DictConfig) and the if node is None blocks in the my_glob_impl function body. The if node is None check should come first, as otherwise that if-block will never be reached.
@Jasha10 , thank you for your suggestion! It works perfectly fine. One minor question is left - I can't get my head around how to add to that list. For example:
simple_app \
'+sweep={name:t4,a:4},{name:z50,a:50}' \
params.name='${sweep.name}' params.a='${sweep.a}' \
app.verbose=true \
-m
Then my exclude_keys should contain not only the glob for app.* but also sweep as a key to be excluded.
The question is how to combine the dynamic list from '${my_glob: app.*}' with the static list [sweep]?
Do I need to write yet another resolver, or is there already a resolver available for list merging?
You certainly could write yet another resolver:
OmegaConf.register_new_resolver(name="concat", resolver=lambda *lists: [elt for l in lists for elt in l])
exclude_keys: '${concat: ${my_glob: app.*}, [sweep]}'
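Stripped of the resolver machinery, the lambda above is plain list flattening. The sketch below shows it as a named function applied to ordinary Python lists; the globbed list is a stand-in for what ${my_glob: app.*} resolves to with the config above, and I am assuming the resolver arguments behave like ordinary iterables.

```python
from typing import Any, List, Sequence


def concat(*lists: Sequence[Any]) -> List[Any]:
    """Flatten any number of lists into one list, preserving order."""
    return [element for lst in lists for element in lst]


# Stand-in for the list that ${my_glob: app.*} resolves to.
globbed = ["app.verbose", "app.config", "app.setup"]
print(concat(globbed, ["sweep"]))
# ['app.verbose', 'app.config', 'app.setup', 'sweep']
```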
Another option would be to modify the my_glob_impl function above to take a list of patterns:
def multi_glob_impl(patterns: list[str], _root_: DictConfig) -> List[str]:
    """
    Like `my_glob_impl`, with two differences:
    - takes a list of `patterns` instead of one pattern
    - it is allowed for patterns to not end with a wildcard `.*` in which
      case no globbing is performed.
    """
    ret = []
    for pattern in patterns:
        if pattern.endswith(".*"):
            pattern = pattern.removesuffix(".*")
            node = OmegaConf.select(_root_, key=pattern)
            if node is None:
                continue
            if not isinstance(node, DictConfig):
                raise NotImplementedError(type(node))
            else:
                ret += [f"{pattern}.{key}" for key in node.keys()]
        else:
            ret.append(pattern)
    return ret


OmegaConf.register_new_resolver(name="multi_glob", resolver=multi_glob_impl)
exclude_keys: '${multi_glob: [app.*, sweep]}'
However, neither of these options allows you to extend exclude_keys from the command line. To do that, let's suppose you have a top-level dict called my_excludes in your config:
# config.yaml
params:
  a: 4
  name: test
app:
  verbose: false
  config: false
  setup: true
my_excludes:
  app: app.*
  sweep: sweep
  self: keys_to_exclude
hydra:
  run:
    dir: ${hydra.job.name}/single/${now:%Y-%m-%d}/${now:%H-%M-%S}
  sweep:
    dir: ${hydra.job.name} #_${hydra.job.id}
    subdir: ${hydra.job.num}_${hydra.job.override_dirname}
  job:
    name: ${params.name}
    override_dirname: ???
    config:
      # configuration for the ${hydra.job.override_dirname} runtime variable
      override_dirname:
        kv_sep: '='
        item_sep: ','
        # currently I'm using [app.verbose,app.config,app.setup]
        exclude_keys: '${multi_glob: ${oc.dict.values: my_excludes}}'
Above we are using OmegaConf's built-in oc.dict.values resolver to get a list of values from the top-level my_excludes mapping. These values are then passed to the multi_glob resolver to generate the exclude_keys list.
So the default excludes will be app.*, sweep, and keys_to_exclude.
You can override this from the command-line as follows:
$ python3 simple_app.py params.a=4,5,6 app.verbose=true '~my_excludes.app' '+my_excludes.p=params.*' -m
[2021-11-05 11:30:02,198][HYDRA] Launching 3 jobs locally
[2021-11-05 11:30:02,198][HYDRA] #0 : params.a=4 app.verbose=True ~my_excludes.app=null +my_excludes.p=params.*
exclude_keys=['sweep', 'keys_to_exclude', 'params.a', 'params.name']
Working dir: /home/jasha10/hydra_tmp/tmp1873/test/0_+my_excludes.p=params.*,app.verbose=True,~my_excludes.app=null
[2021-11-05 11:30:02,289][HYDRA] #1 : params.a=5 app.verbose=True ~my_excludes.app=null +my_excludes.p=params.*
exclude_keys=['sweep', 'keys_to_exclude', 'params.a', 'params.name']
Working dir: /home/jasha10/hydra_tmp/tmp1873/test/1_+my_excludes.p=params.*,app.verbose=True,~my_excludes.app=null
[2021-11-05 11:30:02,390][HYDRA] #2 : params.a=6 app.verbose=True ~my_excludes.app=null +my_excludes.p=params.*
exclude_keys=['sweep', 'keys_to_exclude', 'params.a', 'params.name']
Working dir: /home/jasha10/hydra_tmp/tmp1873/test/2_+my_excludes.p=params.*,app.verbose=True,~my_excludes.app=null
As you can see, with the '~my_excludes.app' command-line override, we remove "app.*" from the list of excludes, and with +my_excludes.p=params.* we are adding the glob "params.*" to the set of excludes.
The motivation for having the top-level my_excludes be a dict instead of a list is that it is easier to manipulate a dict using the command-line syntax. The values of the my_excludes dict are the important part (as the values of my_excludes are what gets passed to multi_glob_impl).
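To see why the my_excludes overrides themselves still show up in the directory names in the log above, here is a simplified model of the filtering. It is not Hydra's actual code: I am assuming, based only on the logged directory names, that an override is dropped when its key appears exactly in exclude_keys and that the remaining overrides are sorted and joined with item_sep. The helper names override_key and model_override_dirname are hypothetical.

```python
from typing import List


def override_key(override: str) -> str:
    """Strip a leading `+` or `~` and the `=value` part, leaving the config key."""
    return override.lstrip("+~").split("=", 1)[0]


def model_override_dirname(
    overrides: List[str], exclude_keys: List[str], item_sep: str = ","
) -> str:
    """Simplified model: drop overrides whose key is listed exactly, sort, then join."""
    kept = sorted(o for o in overrides if override_key(o) not in exclude_keys)
    return item_sep.join(kept)


# Overrides and exclude_keys for job #0, copied from the log above.
overrides = [
    "params.a=4",
    "app.verbose=True",
    "~my_excludes.app=null",
    "+my_excludes.p=params.*",
]
exclude_keys = ["sweep", "keys_to_exclude", "params.a", "params.name"]
print(model_override_dirname(overrides, exclude_keys))
# +my_excludes.p=params.*,app.verbose=True,~my_excludes.app=null
```

Under this model, "keys_to_exclude" matches none of the override keys (my_excludes.app, my_excludes.p), so those two overrides remain in the directory name.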
Looking back at the above, using a regex pattern will give the most flexibility when deciding which overrides to exclude. This would require implementing your own logic to construct the directory name based on the list of overrides that are used for the current job. You can access the list of overrides in ${hydra.overrides.task}.
Here's what I have in mind, using regex patterns to see whether each override should be excluded:
import hydra
import os
from omegaconf import OmegaConf, ListConfig


def my_subdir_suffix_impl(
    task_overrides: ListConfig,  # list[str]: overrides passed at command line
    exclude_patterns: ListConfig,  # list[str]: regex patterns to exclude
) -> str:
    """Return a string: concatenation of overrides that are not matched by any of the `exclude_patterns`."""
    import re

    rets: list[str] = []
    for override in task_overrides:
        should_exclude = any(
            re.search(exc_pat, override) for exc_pat in exclude_patterns
        )
        if not should_exclude:
            rets.append(override)

    return "_".join(rets)


OmegaConf.register_new_resolver("my_subdir_suffix", my_subdir_suffix_impl)


@hydra.main(config_path=".", config_name="config")
def main(cfg):
    print(f"{os.getcwd()=}")


main()
# config.yaml
params:
  a: 4
  name: test
app:
  verbose: false
  config: false
  setup: true
my_excludes:
  app: app.*
  sweep: sweep
  self: my_excludes
hydra:
  sweep:
    dir: ${hydra.job.name} #_${hydra.job.id}
    subdir: "${hydra.job.num}_${my_subdir_suffix: ${hydra.overrides.task}, ${oc.dict.values:my_excludes}}"
$ python3 simple_app.py params.a=4,5,6 app.verbose=true '~my_excludes.app' '+my_excludes.p="params.*"' -m
[2021-11-07 11:25:35,072][HYDRA] Launching 3 jobs locally
[2021-11-07 11:25:35,072][HYDRA] #0 : params.a=4 app.verbose=True ~my_excludes.app=null +my_excludes.p="params.*"
os.getcwd()='/home/jasha10/hydra_tmp/tmp1873/simple_app/0_app.verbose=True'
[2021-11-07 11:25:35,161][HYDRA] #1 : params.a=5 app.verbose=True ~my_excludes.app=null +my_excludes.p="params.*"
os.getcwd()='/home/jasha10/hydra_tmp/tmp1873/simple_app/1_app.verbose=True'
[2021-11-07 11:25:35,259][HYDRA] #2 : params.a=6 app.verbose=True ~my_excludes.app=null +my_excludes.p="params.*"
os.getcwd()='/home/jasha10/hydra_tmp/tmp1873/simple_app/2_app.verbose=True'
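The filtering can be traced by hand with plain Python lists. The sketch below repeats the same logic in a standalone function (the name subdir_suffix is hypothetical) and feeds it the overrides from the log line for job #0; I am assuming that ${hydra.overrides.task} contains exactly the strings shown in that log line.

```python
import re
from typing import List


def subdir_suffix(task_overrides: List[str], exclude_patterns: List[str]) -> str:
    """Join the overrides that no exclude pattern matches."""
    kept = [
        override
        for override in task_overrides
        if not any(re.search(pattern, override) for pattern in exclude_patterns)
    ]
    return "_".join(kept)


# Assumed contents of ${hydra.overrides.task} for job #0, copied from the log above.
task_overrides = [
    "params.a=4",
    "app.verbose=True",
    "~my_excludes.app=null",
    '+my_excludes.p="params.*"',
]
# Values of my_excludes once `~my_excludes.app` and `+my_excludes.p` have been applied.
exclude_patterns = ["sweep", "my_excludes", "params.*"]

print(subdir_suffix(task_overrides, exclude_patterns))  # app.verbose=True
```

One thing to keep in mind: re.search matches anywhere in the string and `.` matches any character, so a pattern such as app.* would also match an override like mapper.x=1. Anchoring and escaping the pattern (for example ^app\.) is stricter.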
So many options now! This is really great - I ended up using the concat method for the moment, but will re-investigate later, once I have a few other people look at it. Two things are weird to me:
you are using my_excludes.p - where is p defined?
in the config file you are using self: my_excludes - how is that working?
I also never used oc.dict.values and hydra.overrides.task before - they are very neat things, which I will shamelessly steal from you ;-).
> I also never used oc.dict.values and hydra.overrides.task before - they are very neat things, which I will shamelessly steal from you ;-).
Haha, good!
> you are using my_excludes.p - where is p defined?
I am using one of the techniques from the Modifying the Config Object section of the docs.
At the command line, I typed '+my_excludes.p="params.*"'. The plus symbol + takes care of adding the key "p" to my_excludes. If the plus were left out then Hydra would fail with an error.
I used the plus here to demonstrate how you can dynamically add keys to my_excludes using the command line. In this particular example, typing '+my_excludes.p="params.*"' at the command line prevents override keys starting with "params." from appearing in the output directory name. Meanwhile, my use of a tilde ~ in the override '~my_excludes.app' demonstrates how to delete a key from my_excludes at the CLI.
> in the config file you are using self: my_excludes - how is that working?
The word "self" is not special here; I could have used e.g. foobar: my_excludes instead.
What matters is that the string "my_excludes" shows up as one of the values in the cfg.my_excludes DictConfig. This prevents the word "my_excludes" from appearing in the name of the output directory.
For example, you can try deleting the self: my_excludes line from the config file to see what the output directory name is.
# With `self: my_excludes` deleted from the config:
$ python3 simple_app.py '~my_excludes.app' -m
...
os.getcwd()='/home/jbss/hydra_tmp/tmp1873/simple_app/0_~my_excludes.app=null'
# With `self: my_excludes` included in the config:
$ python3 simple_app.py '~my_excludes.app' -m
...
os.getcwd()='/home/jbss/hydra_tmp/tmp1873/simple_app/0_'
🚀 Feature Request
If in my command line I would like to exclude app.verbose and app.config, then in override_dirname I could write one of these to make it work:
so above line would give me override_dirname="params.a=4" for one of the sweep tasks.
Motivation
This would simplify directory naming for runs and would allow better integration with command line functionality.
Is your feature request related to a problem? Please describe. It's related to issue #1872, which also uses the command line as the primary driver for the sweep setup. Overall, both of these issues should bring the command-line functionality in sync with the YAML file experience users currently have.
Pitch
Describe the solution you'd like The solution would allow excluding keys based on more generic criteria, which would keep the app's configuration robust as it evolves.
Describe alternatives you've considered The only alternative is not to use it and to configure the dirname using custom interpolation and logic. But that would limit generalizability, since the user would have to modify this logic each time a new argument is considered for sweeping.
Are you willing to open a pull request? no
Additional context
I will take any workaround that requires modifying the main config.yaml file (or any custom Python logic I need to write into my app) only once, for all possible future sweeps.
Thanks in Advance!