facebookresearch / hydra

Hydra is a framework for elegantly configuring complex applications
https://hydra.cc
MIT License
8.67k stars 624 forks source link

[Feature Request] ConfigStore.store to act as decorator #1982

Open Jasha10 opened 2 years ago

Jasha10 commented 2 years ago

🚀 Feature Request: @cs.store decorator

Using the ConfigStore.store API to register structured configs results in code that looks like this:

from dataclasses import dataclass
from hydra.core.config_store import ConfigStore

cs = ConfigStore.instance()

@dataclass
class ObjConf:
    foo: int

cs.store(name="obj_schema", node=ObjConf, group="grp")

What if cs.store could act as a decorator?

The proposed API would allow the following, which should be strictly equivalent in behavior to the above:

from dataclasses import dataclass
from hydra.core.config_store import ConfigStore

cs = ConfigStore.instance()

@cs.store(name="obj_schema", group="grp")
@dataclass
class ObjConf:
    foo: int
mayeroa commented 2 years ago

This could be a really interesting feature. :) One question: does combining the two decorators seem to be a good idea? Imagine something like that:

# Without parentheses, acts like @dataclass decorator
@structured_config 
class ObjConf:
    foo: int

# With parentheses, store the config into the Config Store instance
@structured_config(name="obj_schema", group="grp")
class ObjConf:
    foo: int

And structured_config must wrap the class into a dataclass, plus store the config into the config store. What do you think?

EDIT: If we take the example from that page: https://hydra.cc/docs/tutorials/structured_config/config_groups/, we could simplify the code from:

@dataclass
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@dataclass
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    port: int = 5432
    timeout: int = 10

@dataclass
class Config:
    # We will populate db using composition.
    db: Any

# Create config group `db` with options 'mysql' and 'postgreqsl'
cs = ConfigStore.instance()
cs.store(name="config", node=Config)
cs.store(group="db", name="mysql", node=MySQLConfig)
cs.store(group="db", name="postgresql", node=PostGreSQLConfig)

@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))

to:

@structured_config(group='db', name='mysql')
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@structured_config(group='db', name='postgresql')
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    port: int = 5432
    timeout: int = 10

@structured_config(name='config')
class Config:
    # We will populate db using composition.
    db: Any

@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))
Jasha10 commented 2 years ago

Thanks for the suggestion @mayeroa. That idea certainly could work, though I'd be concerned about tooling support (e.g. mypy and editor completion) if the @dataclass decorator is not added explicitly. I noticed that you skipped the ConfigStore.instance() step in your code above. This is another idea I had for adding convenience to the config store API: allowing users to import a callable from hydra.core.config_store that allows instant storage without the need for explicit instantiation.

mayeroa commented 2 years ago

Hi Jasha, thanks for your feedback :). Concerning the tooling support, based on this issue (https://github.com/samuelcolvin/pydantic/pull/2721), I manage to have autocompletion working in VSCode using the __dataclass_transform__ decorator, like this:

# Standard libraries
from dataclasses import dataclass
from typing import Any, Callable, Optional, Tuple, TypeVar, Union

# Third-party libraries
import hydra
from hydra.core.config_store import ConfigStore
from omegaconf import OmegaConf

_T = TypeVar('_T')

def __dataclass_transform__(
    *,
    eq_default: bool = True,
    order_default: bool = False,
    kw_only_default: bool = False,
    field_descriptors: Tuple[Union[type, Callable[..., Any]], ...] = (()),
) -> Callable[[_T], _T]:
    return lambda a: a

@__dataclass_transform__(eq_default=True, order_default=False, kw_only_default=True)
def structured_config(
    name: Optional[str] = None,
    group: Optional[str] = None,
    package: Optional[str] = None,
    provider: Optional[str] = None,
    init: bool = True,
    repr: bool = True,
    eq: bool = True,
    order: bool = False,
    unsafe_hash: bool = False,
    frozen: bool = False
):
    def decorator(cls=None):
        def wrapper(cls: Any):
            # Wrap class into a dataclass
            new_cls = dataclass(cls, init=init, repr=repr, eq=eq, order=order, unsafe_hash=unsafe_hash, frozen=frozen)            
            # Store structure config into Config Store
            if name is not None:
                config_store = ConfigStore.instance()
                config_store.store(group=group, name=name, package=package, provider=provider, node=new_cls)
            return new_cls

        # See if we're being called as @structured_config or @structured_config().
        if cls is None:
            return wrapper
        return wrapper(cls)
    return decorator

@structured_config(group='db', name='mysql')
class MySQLConfig:
    driver: str = "mysql"
    host: str = "localhost"
    port: int = 3306

@structured_config(group='db', name='postgresql')
class PostGreSQLConfig:
    driver: str = "postgresql"
    host: str = "localhost"
    timeout: int = 10
    port: int = 5432

@structured_config(name='config')
class Config:
    # We will populate db using composition.
    db: Any

@hydra.main(config_path=None, config_name="config")
def my_app(cfg: Config) -> None:
    print(OmegaConf.to_yaml(cfg))

if __name__ == "__main__":
    my_app()

which produces, in VSCode, autocompletion like this:

Capture d’écran 2022-01-24 à 09 42 32

Need to check if mypy supports such a feature

PS: More information on the specification: https://github.com/microsoft/pyright/blob/main/specs/dataclass_transforms.md

mayeroa commented 2 years ago

Having a look at mypy, it seems that such a behavior is not supported: https://mypy.readthedocs.io/en/stable/additional_features.html#caveats-known-issues

Mypy effectively complains if we decorate a class by a function returning dataclass:

Capture d’écran 2022-01-24 à 11 59 47

Jasha10 commented 2 years ago

I manage to have autocompletion working in VSCode using the dataclass_transform decorator

Very cool!

addisonklinke commented 2 years ago

This seems to have many parallels to Hydra Zen's builds() function. I would be curious if it's possible to combine the features of both, i.e a decorator which

  1. Dynamically creates a dataclass for the structured config based on the type hints of the class's init signature (or function's call signature) - this is the Hydra Zen part
  2. Accepts the same keyword arguments as cs.store() so that the dynamic config is automatically made available in the global config store
Jasha10 commented 2 years ago

That's an interesting idea @addisonklinke. I think there may be a circularity issue that prevents using builds as a decorator. Check out this example:

>>> import hydra_zen
>>> @hydra_zen.builds
... def foo(x: int, y: str = "abc") -> str:
...     return str(x) + y
...
>>> foo
<class 'types.Builds_foo'>
>>> instantiate(foo)
Builds_foo(_target_='__main__.foo')
>>> instantiate(instantiate(foo))
Builds_foo(_target_='__main__.foo')
>>> instantiate(instantiate(instantiate(foo)))
Builds_foo(_target_='__main__.foo')
>>> instantiate(foo)(x=123,y="xyz")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'Builds_foo' object is not callable

No matter how many times you call instantiate, you always get back an instance of Builds_Foo (rather than getting the original function foo of type Callable[[int, str], str]). This is because the original function is shadowed by the dataclass with the same name; _target_='__main__.foo' points to the dataclass when the intended behavior would be for _target_ to refer to the underlying function.

addisonklinke commented 2 years ago

@Jasha10 Hydra-Zen doesn't currently support builds as a decorator, so I think it's expected that you'll run into issues there

I probably should've linked this request where they've discussed adding that functionality, The main idea would be storing the output of builds(foo) in an object attribute like foo.__hydra_config__ which can later be accessed by instatiate or ConfigStore.store()

rsokl commented 1 year ago

For those interested, hydra-zen is implementing a feature along these lines, which permits a decorator pattern for adding configs to Hydra's store: https://github.com/mit-ll-responsible-ai/hydra-zen/pull/331