Open monney opened 2 years ago
Thanks for the feature request, @monney. See this comment for a workaround.
In the proposed change, the returned config would instead be:
I don't think we can implement your proposal as written because it would be a breaking change, but I hope that in the future we can introduce a new defaults list keyword (e.g. concatenate
) to achieve what you suggest:
defaults:
- concatenate elements: [foo, fizz]
Edit: or some other mechanism for composing a list via defaults
.
@Jasha10 Thank you! That does seem like a reasonable proposal, however this feature appears like it produced an error until it was fixed in 1.1.1 (#1724). I wonder if this would still be considered a breaking change ? Specifically I'm asking when this bug was fixed, is it possible that this composition was not intended to occur like this in the first place and is thereby a bug?
I believe that #1724 is actually a different issue. If my understanding is correct, that issue is referring to the case where the defaults list is used to load a yaml file that contains a top-level yaml sequence. Meanwhile, this issue is highlighting the case where a defaults list contains a key/value pair where the value is itself a list.
@Jasha10 It is slightly different but, list behavior was broken by the same item which blocked the top level yaml sequence. I tried my minimal example on the older hydra versions and get the same error. You just couldn't load yaml files if the top most element was part of a list or sequence.
I would like to add my opinion to this issue:
I often have multiple configurations in the same folder, which are (usually) mutually exclusive, but not always, e.g.:
optim
| - adam.yaml
| - sgd.yaml
Where a single yaml
is simply:
_target_: torch.optim.Adam
lr: 0.003
etc: ...
Where its makes sense for my root level config to select both elements, as a list:
defaults:
- optim: [adam, sgd]
With the desired (equivalent) config being:
optim:
- _target_: torch.optim.Adam
lr: ...
- _target_: torch.optim.SGD
lr: ...
But this is hard to do using current default list semantics.
IMO using the [<config1>, <config2>]
syntax for selecting multiple configurations is wrong and misleading.
What you are actually doing is:
defaults:
- optim: {adam, sgd}
(Also known as yaml set notation, which introduces some other particularities, but let's ignore these right now.)
I think the main issue is that dictionary and list syntax is being mixed by hydra: the defaults list is not a list, it is a dictionary; i.e. the output of:
defaults:
- foo: bar
is not [{foo: bar}]
but {foo: bar}
.
I do not immediately see why the defaults list should not be either a dictionary or list except for legacy reasons in the implementation, but foregoing this, I think at least hydra should stick to the appropriate yaml syntax.
Thanks for the feedback, @robogast!
the defaults list is not a list, it is a dictionary ...
One important reason for having defaults
be a sequence rather than a mapping is to preserve order. Yaml mappings are not guaranteed to preserve order, while yaml sequences are. Using a sequence allows for a guaranteed difference between defaults: [foo, bar]
and defaults: [bar, foo]
.
You are also right about the legacy reasons.
Thanks for the tip about "yaml set notation". I didn't know about that.
>>> import yaml
>>> yaml.safe_load("{foo, bar}")
{'foo': None, 'bar': None}
>>> yaml.safe_load("{foo, bar: baz}")
{'foo': None, 'bar': 'baz'}
I would also like to do this from the command line somehow!
For example: python train.py optim+=adam
To whatever the default optim.lr
entries are, I want to add more (list concatenation), to achieve the same effect as described above by @robogast, but from the command line.
is anyone looking into this?
🚀 Feature Request
Currently when loading multiple configs from a config group, a top-level list in the config files is not composed in a useful way.
Motivation
When you have a list of possible options that you want to construct, such as for example choosing multiple callbacks among a group of callbacks. Ideally you might want to construct the list by composing configs of list elements.
for example the following:
config.yaml
with the
elements
directory having two files:foo.yaml
fizz.yaml
This returns the composed config:
Pitch
I think a good thing to do here would instead to compose these configs as if they had literally been written in the same yaml file, which what is sort of what I would expect to happen if we load multiple configs from the same group.
In the proposed change, the returned config would instead be:
The current alternative is every subset of elements you could want, needs a different config, or you have to name each element, load the elements and then use interpolation to construct the list separately, this is quite clunky, and bloats the final config file.
Additional context
Hydra versions tested: 1.1.1 and main branch from github
I think this might have actually worked as proposed in a previous version of Hydra, because I have a few configs written with this structure which I remember using, but I can't seem to find which version had it working like this.