Open omry opened 4 years ago
Probably too big of a change for 2.1 at this point.
Not sure if this is related, but seems likely.
Given a config, containing a custom resolver as_tuple
:
vec:
_target_: sklearn.feature_extraction.text.HashingVectorizer
ngram_range: ${as_tuple:2,3}
pipe:
_target_: sklearn.pipeline.Pipeline
steps:
- - vec
- ${vec}
- - clf
- _target_: sklearn.dummy.DummyClassifier
and an associated script:
import hydra
import hydra.utils as hu
from omegaconf import OmegaConf
def resolve_tuple(*args):
return tuple(args)
OmegaConf.register_new_resolver("as_tuple", resolve_tuple)
@hydra.main(config_name="conf", config_path="conf"
def main(cfg):
obj = hu.instantiate(cfg.pipe, _convert_='partial')
print(obj)
Running this script with the associated config raises the following:
$ python test.py
Error executing job with overrides: []
Traceback (most recent call last):
File "test.py", line 21, in <module>
main()
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/main.py", line 52, in decorated_main
config_name=config_name,
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/utils.py", line 378, in _run_hydra
lambda: hydra.run(
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/utils.py", line 214, in run_and_report
raise ex
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/utils.py", line 211, in run_and_report
return func()
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/utils.py", line 381, in <lambda>
overrides=args.overrides,
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/hydra.py", line 111, in run
_ = ret.return_value
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/core/utils.py", line 233, in return_value
raise self._return_value
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/core/utils.py", line 160, in run_job
ret.return_value = task_function(task_cfg)
File "test.py", line 16, in main
obj = hu.instantiate(cfg.pipe, _convert_='partial')
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/hydra/_internal/instantiate/_instantiate2.py", line 175, in instantiate
OmegaConf.resolve(config)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/omegaconf.py", line 792, in resolve
omegaconf._impl._resolve(cfg)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 40, in _resolve
_resolve_container_value(cfg, k)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 25, in _resolve_container_value
_resolve(node)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 44, in _resolve
_resolve_container_value(cfg, i)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 25, in _resolve_container_value
_resolve(node)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 44, in _resolve
_resolve_container_value(cfg, i)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 19, in _resolve_container_value
_resolve(resolved)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 40, in _resolve
_resolve_container_value(cfg, k)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/_impl.py", line 23, in _resolve_container_value
node._set_value(resolved._value())
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/nodes.py", line 44, in _set_value
self._val = self.validate_and_convert(value)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/nodes.py", line 57, in validate_and_convert
return self._validate_and_convert_impl(value)
File "/Users/gphillips/anaconda3/envs/ds-core/lib/python3.7/site-packages/omegaconf/nodes.py", line 134, in _validate_and_convert_impl
f"Value '{t.__name__}' is not a supported primitive type"
omegaconf.errors.UnsupportedValueType: Value 'tuple' is not a supported primitive type
However, if you modify the python file to only instantiate the vectorizer.
...
def main(cfg):
hu.instantiate(cfg.vec, _convert_='partial')
print(obj)
...
The target instantiates successfully.
$ python test.py
HashingVectorizer(ngram_range=(2, 3))
Is this the intended behavior? It seems the allow_objects
flag for the config is somehow None
when instantiating the vec
object in the nested instantiation causing the error, but I am not sure how or what to do to mitigate that occuring.
Hello again @gphillips-ema :)
Thanks for sharing this example. I believe you're right that there's a bug here. The issue you are having is actually not tuple-specific; the same problem occurs with any other type that is not supported by OmegaConf (i.e. not a dict/list/primitive).
For your use-case, here is a workaround:
vec:
_target_: sklearn.feature_extraction.text.HashingVectorizer
ngram_range:
_target_: builtins.tuple
_args_:
- [2, 3]
pipe:
_target_: sklearn.pipeline.Pipeline
steps:
- - vec
- ${vec}
- - clf
- _target_: sklearn.dummy.DummyClassifier
I've opened an issue against the Hydra repo here: https://github.com/facebookresearch/hydra/issues/1865.
@Jasha10 thanks for the workaround. I had been doing the following:
vec:
_target_: sklearn.feature_extraction.text.HashingVectorizer
ngram_range:
- 2
- 3
This at least instantiates, but I am touch uncomfortable as its a list and not a tuple.
My real motivation for this is to simplify overrides for pipelines that can become complex.
Take this example:
pipe:
_target_: sklearn.pipeline.Pipeline
steps:
- - transformer
- _target_: sklearn.compose.ColumnTransformer
transformers:
- - text_pipeline
- _target_: sklearn.pipeline.Pipeline
steps:
- - vectorizer
- _target_: sklearn.feature_extraction.text.HashingVectorizer
n_features: ${pow:2,18}
ngram_range:
- 1
- 4
- Description
- - classifier
- _target_: sklearn.dummy.DummyClassifier
In this case a user can only override ngram_range min or max individually like so: pipe.steps.0.1.transformers.0.1.steps.0.1.ngram_range.1=5
This could get rather frustrating. But if you can interpolate the vectorizer as follows:
vectorizer:
- _target_: sklearn.feature_extraction.text.HashingVectorizer
n_features: ${pow:2,18}
ngram_range:
- 1
- 4
pipe:
_target_: sklearn.pipeline.Pipeline
steps:
- - transformer
- _target_: sklearn.compose.ColumnTransformer
transformers:
- - text_pipeline
- _target_: sklearn.pipeline.Pipeline
steps:
- - vectorizer
- ${vectorizer}
- Description
- - classifier
- _target_: sklearn.dummy.DummyClassifier
The override is now much simpler as the user just needs to add vectorizer.ngram_range.1=5
.
Being able to also use a tuple instead of single min or max values is just the icing on the cake here.
This pattern of needing an object to instantiate within a target does seem to pop up a fair amount in our pipelines.
Any update or workaround on this? I prefer to use tuples instead of lists in my configs for their immutability.
Any update or workaround on this? I prefer to use tuples instead of lists in my configs for their immutability.
Unlikely to be added any time soon unless someone feels like writing a big PR for it.
Potential workarounds:
as_tuple
resolver as in https://github.com/omry/omegaconf/issues/392#issuecomment-948956968Thanks for the quick reply @odelalleau .
To clarify, I use structured configs (dataclasses) to pass the configs around for type checking.
I temporarily convert to a DictConfig via dict_config = OmegaConf.structured(config)
to merge with cli overwrites (via OmegaConf merge) and then convert back to dataclass via OmegaConf.to_container(dict_config, resolve=True, structured_config_mode=SCMode.INSTANTIATE)
.
With that workflow, the resolver seems not to do anything. Instead, the conversion to a dict_config = OmegaConf.structured(config)
converts the tuples to lists.
With that workflow, the resolver seems not to do anything. Instead, the conversion to a
dict_config = OmegaConf.structured(config)
converts the tuples to lists.
I see -- I'm afraid I'm not entirely sure if there's a way to preserve the tuples in your setup (and no time to look into it closely now).
Looking forward to the feature.
Looking forward to this feature.
Currently tuple is treated as a list.
In addition, it's converted to a mutable sequence:
Another thing this might enable is type safe tuple support:
A cheap alternative to this is to treat tuple as a List and cut some corners with heterogeneous tuples (List[Any]?).
Finally, OmegaConf.to_container() should return a real tuple for input tuples and not a list.
Fixing it probably means the introduction of a TupleConfig class and is potentially a breaking change in some rare cases. This is probably a very large change and I am not convinced it's worth the investment. Please like if you feel this is important.