Open Jasha10 opened 2 years ago
This could be a really interesting feature. :) One question: would combining the two decorators be a good idea? Imagine something like this:
# Without parentheses, acts like @dataclass decorator
@structured_config
class ObjConf:
    foo: int

# With parentheses, store the config into the Config Store instance
@structured_config(name="obj_schema", group="grp")
class ObjConf:
    foo: int

And structured_config must wrap the class into a dataclass, plus store the config into the config store.
What do you think?
EDIT: If we take the example from that page: https://hydra.cc/docs/tutorials/structured_config/config_groups/, we could simplify the code from:
@dataclass
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@dataclass
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    port: int = 5432
    timeout: int = 10

@dataclass
class Config:
    # We will populate db using composition.
    db: Any

# Create config group `db` with options 'mysql' and 'postgresql'
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
cs.store(group="db", name="mysql", node=MySQLConfig)
cs.store(group="db", name="postgresql", node=PostGreSQLConfig)
@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))
to:
@structured_config(group='db', name='mysql')
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@structured_config(group='db', name='postgresql')
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    port: int = 5432
    timeout: int = 10

@structured_config(name='config')
class Config:
    # We will populate db using composition.
    db: Any

@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))
Thanks for the suggestion @mayeroa.
That idea certainly could work, though I'd be concerned about tooling support (e.g. mypy and editor completion) if the @dataclass decorator is not added explicitly.
I noticed that you skipped the ConfigStore.instance() step in your code above. This is another idea I had for adding convenience to the config store API: allowing users to import a callable from hydra.core.config_store that allows instant storage without the need for explicit instantiation.
Hi Jasha, thanks for your feedback :). Concerning the tooling support, based on this issue (https://github.com/samuelcolvin/pydantic/pull/2721), I managed to get autocompletion working in VSCode using the __dataclass_transform__ decorator, like this:
# Standard libraries
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple, TypeVar, Union

# Third-party libraries
import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf

_T = TypeVar('_T')

def __dataclass_transform__(
    *,
    eq_default: bool = True,
    order_default: bool = False,
    kw_only_default: bool = False,
    field_descriptors: Tuple[Union[type, Callable[..., Any]], ...] = (()),
) -> Callable[[_T], _T]:
    return lambda a: a

# kw_only_default=False matches the dataclass actually generated below,
# which is not keyword-only.
@__dataclass_transform__(eq_default=True, order_default=False, kw_only_default=False)
def structured_config(
    cls: Optional[type] = None,
    *,
    name: Optional[str] = None,
    group: Optional[str] = None,
    package: Optional[str] = None,
    provider: Optional[str] = None,
    init: bool = True,
    repr: bool = True,
    eq: bool = True,
    order: bool = False,
    unsafe_hash: bool = False,
    frozen: bool = False,
):
    def wrapper(cls: Any):
        # Wrap class into a dataclass
        new_cls = dataclass(cls, init=init, repr=repr, eq=eq, order=order, unsafe_hash=unsafe_hash, frozen=frozen)
        # Store structured config into Config Store (only when a name is given)
        if name is not None:
            config_store = ConfigStore.instance()
            config_store.store(group=group, name=name, package=package, provider=provider, node=new_cls)
        return new_cls

    # See if we're being called as @structured_config or @structured_config(...).
    # The class is the only positional parameter, so a bare @structured_config
    # receives it as cls, while @structured_config(...) leaves cls as None.
    if cls is None:
        return wrapper
    return wrapper(cls)

@structured_config(group='db', name='mysql')
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@structured_config(group='db', name='postgresql')
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    timeout: int = 10
    port: int = 5432

@structured_config(name='config')
class Config:
    # We will populate db using composition.
    db: Any

@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()
which produces, in VSCode, autocompletion like this:
Need to check if mypy supports such a feature
PS: More information on the specification: https://github.com/microsoft/pyright/blob/main/specs/dataclass_transforms.md
Having a look at mypy, it seems that such a behavior is not supported: https://mypy.readthedocs.io/en/stable/additional_features.html#caveats-known-issues
Mypy does indeed complain if we decorate a class with a function that returns a dataclass.
I manage to have autocompletion working in VSCode using the dataclass_transform decorator
Very cool!
This seems to have many parallels to Hydra Zen's builds() function. I would be curious if it's possible to combine the features of both, i.e. a decorator which
- generates a dataclass for the structured config based on the type hints of the class's init signature (or function's call signature); this is the Hydra Zen part
- calls cs.store() so that the dynamic config is automatically made available in the global config store
That's an interesting idea @addisonklinke. I think there may be a circularity issue that prevents using builds as a decorator. Check out this example:
>>> import hydra_zen
>>> from hydra.utils import instantiate
>>> @hydra_zen.builds
... def foo(x: int, y: str = "abc") -> str:
...     return str(x) + y
...
>>> foo
<class 'types.Builds_foo'>
>>> instantiate(foo)
Builds_foo(_target_='__main__.foo')
>>> instantiate(instantiate(foo))
Builds_foo(_target_='__main__.foo')
>>> instantiate(instantiate(instantiate(foo)))
Builds_foo(_target_='__main__.foo')
>>> instantiate(foo)(x=123,y="xyz")
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'Builds_foo' object is not callable
No matter how many times you call instantiate, you always get back an instance of Builds_foo (rather than getting the original function foo of type Callable[[int, str], str]). This is because the original function is shadowed by the dataclass with the same name; _target_='__main__.foo' points to the dataclass when the intended behavior would be for _target_ to refer to the underlying function.
@Jasha10 Hydra-Zen doesn't currently support builds as a decorator, so I think it's expected that you'll run into issues there.
I probably should've linked this request where they've discussed adding that functionality. The main idea would be storing the output of builds(foo) in an object attribute like foo.__hydra_config__ which can later be accessed by instantiate or ConfigStore.store().
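To illustrate the attribute idea, here is a rough standard-library sketch. It is not hydra-zen's implementation: `attach_config` is a hypothetical name, and the config class is built with `dataclasses.make_dataclass` from the function's signature rather than with `builds`. The decorator returns the original function, so the name `foo` is never shadowed and `_target_` still points at something callable.

```python
import inspect
from dataclasses import field, fields, make_dataclass
from typing import Any, Callable


def attach_config(fn: Callable[..., Any]) -> Callable[..., Any]:
    """Hypothetical decorator: attach a config dataclass, return fn unchanged."""
    required, optional = [], []
    for name, param in inspect.signature(fn).parameters.items():
        annotation = Any if param.annotation is inspect.Parameter.empty else param.annotation
        if param.default is inspect.Parameter.empty:
            required.append((name, annotation))
        else:
            # Note: mutable defaults (list, dict, set) are rejected by dataclasses.
            optional.append((name, annotation, field(default=param.default)))
    target = f"{fn.__module__}.{fn.__qualname__}"
    # Fields without defaults must come first in a dataclass, so _target_ goes last.
    specs = required + optional + [("_target_", str, field(default=target))]
    fn.__hydra_config__ = make_dataclass(f"Builds_{fn.__name__}", specs)
    return fn


@attach_config
def foo(x: int, y: str = "abc") -> str:
    return str(x) + y


# foo is still the original function ...
result = foo(1)
# ... and its config lives on the attribute instead of replacing it.
cfg = foo.__hydra_config__(x=123)
field_names = [f.name for f in fields(foo.__hydra_config__)]
```

Here `result` is `"1abc"` and `field_names` is `['x', 'y', '_target_']`. A store helper could then look up `fn.__hydra_config__` whenever it is handed a plain function.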
For those interested, hydra-zen is implementing a feature along these lines, which permits a decorator pattern for adding configs to Hydra's store: https://github.com/mit-ll-responsible-ai/hydra-zen/pull/331
🚀 Feature Request: @cs.store decorator
Using the ConfigStore.store API to register structured configs results in code that looks like this:
What if cs.store could act as a decorator?
The proposed API would allow the following, which should be strictly equivalent in behavior to the above:
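As a rough sketch of what a decorator form of store could look like (this is not Hydra's actual API): the hypothetical helper `as_decorator` wraps any function with a `store(name=..., node=..., group=...)` shape. `fake_store` and `registry` are stand-ins so the snippet runs without Hydra; with Hydra installed, one would presumably pass `ConfigStore.instance().store` instead.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Tuple, TypeVar

T = TypeVar("T")


def as_decorator(store_fn: Callable[..., None]) -> Callable[..., Callable[[T], T]]:
    """Hypothetical helper: turn a store(name, node, group) function into a decorator factory."""

    def factory(*, name: str, group: Optional[str] = None, **kwargs: Any) -> Callable[[T], T]:
        def decorator(node: T) -> T:
            store_fn(name=name, node=node, group=group, **kwargs)
            # Return the class unchanged so it remains usable as a normal dataclass.
            return node

        return decorator

    return factory


# Stand-in for ConfigStore.instance().store (assumption, for self-containment).
registry: Dict[Tuple[Optional[str], str], Any] = {}


def fake_store(name: str, node: Any, group: Optional[str] = None) -> None:
    registry[(group, name)] = node


store = as_decorator(fake_store)


@store(group="db", name="mysql")
@dataclass
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306


# Equivalent explicit call, for comparison:
fake_store(name="mysql_explicit", node=MySQLConfig, group="db")
```

Keeping `@dataclass` explicit under the store decorator means type checkers and editors see an ordinary dataclass, which sidesteps the tooling concern raised earlier in the thread.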