mit-ll-responsible-ai / hydra-zen

Create powerful Hydra applications without the yaml files and boilerplate code.
https://mit-ll-responsible-ai.github.io/hydra-zen/
MIT License
315 stars 13 forks source link

New feature: `hydra_zen.builds_decorator` #93

Closed jgbos closed 2 years ago

jgbos commented 3 years ago

We had a discussion on Teams about possibly adding a builds_decorator of the form:

def builds_decorator(f):
    Conf = builds(f, populate_full_sig=True)
    f.config = Conf
    return f

Then we can automatically build configs for a function or class. For example, to get the config for the following function,

@builds_decorator
def foo(x, val=1):
  return x, val

simply access the config attribute,

FooConfig = foo.config(val=2)
rsokl commented 3 years ago

I like this idea and have been trying to think of ways to make builds a bit more ergonomic along these lines. That being said, it feels a bit "risky" to actually ship this:

image

Maybe we could simply document this as a recipe for library authors to use, but they are ultimately responsible for opting-in/maintaining it. Like, I can definitely imagine mushin implementing this.

jgbos commented 3 years ago

Good point. I just remember we were talking about so thought I'd document our discussion at least :)

rsokl commented 3 years ago

Yep! I think it is good to document our thoughts on these sorts of things. It can be also helpful for us to reference these issues if users request similar features later.

rsokl commented 2 years ago

@jgbos what do you think about this approach

from functools import partial

def as_builds(target):
    return partial(builds, target, populate_full_signature=True)

RandomCrop = as_builds(transforms.RandomCrop)
RandomHorizontalFlip = as_builds(transforms.RandomHorizontalFlip)
ColorJitter = as_builds(transforms.ColorJitter)
RandomRotation = as_builds(transforms.RandomRotation)
ToTensor = as_builds(transforms.ToTensor)
Normalize = as_builds(transforms.Normalize)
Compose = as_builds(transforms.Compose)

Cifar10TrainTransformsConf = Compose(
    transforms=[
        RandomCrop(size=32, padding=4),
        RandomHorizontalFlip(),
        ColorJitter(brightness=0.25, contrast=0.25, saturation=0.25),
        RandomRotation(degrees=2),
        ToTensor(),
        Normalize(
            mean=[0.4914, 0.4822, 0.4465],
            std=[0.2023, 0.1994, 0.2010],
        ),
    ],
)

A really cool thing about this is that you can just copy-and-paste your code from your jupyter notebook, and the code that was building your transform is now building a conf for that same transform!

>>> instantiate(Cifar10TrainTransformsConf)
Compose(
    RandomCrop(size=(32, 32), padding=4)
    RandomHorizontalFlip(p=0.5)
    ColorJitter(brightness=[0.75, 1.25], contrast=[0.75, 1.25], saturation=[0.75, 1.25], hue=None)
    RandomRotation(degrees=[-2.0, 2.0], interpolation=nearest, expand=False, fill=0)
    ToTensor()
    Normalize(mean=[0.4914, 0.4822, 0.4465], std=[0.2023, 0.1994, 0.201])
)

Obviously all of the thing = as_builds(thing) isn't super ideal. But maybe there is a clever way to deal with this (like make_builds_namespace(transforms) returns a namespace of all the public callables, but with as_builds applied to them?)

Some cool things about this: our runtime validation ends up being really useful and intuitive here:

>>> Normalize(average=2)
TypeError: Building: Normalize ..
The following unexpected keyword argument(s) was specified for torchvision.transforms.transforms.Normalize via `builds`: average
jgbos commented 2 years ago

I guess I'm not sure how this really helps (other than the validation). Are you saying that since a project is going to use hydra-zen it might as well convert the project's classes/funcs to output builds objects? Or is this just done in the configs.py?

rsokl commented 2 years ago

Or is this just done in the configs.py?

Yep, I am thinking that end-users could do something like this in configs.py. That way writing the configs is literally as easy as initializing the object itself.

Although it would be super slick if we could do something like:

from torchvisions import transforms

#  some magic where we auto-apply `as_builds` to everything in `transforms.__all__` and then
# return a corresponding namespace that exposes the config counterparts
cfg = hydrate(transforms)  

cfg.Normalize  # is actually partial(builds, transforms.Normalize)

and in this way, the library maintainers don't have to do anything at all to make it easy to configure their stuff.

jgbos commented 2 years ago

👍 to hydrate(transforms)

I like that. I always hate that tab completion for imports would have Normalize and NormalizeCfg.

rsokl commented 2 years ago

The biggest blocker for this is getting static tooling to play nice. Right now, I can't think of any way to get hydrate(transforms) to expose any accurate auto-completion information. So people would be flying blind with this feature.

Closing this for the time being. Hopefully https://github.com/mit-ll-responsible-ai/hydra-zen/pull/129 gets most of the job done.

addisonklinke commented 2 years ago

potential of colliding with objects that already have a config method

@rsokl What about a runtime check in the decorator - does that sufficiently handle the collision downside?

def builds_decorator(f):
    if hasattr(f, 'config'):
        raise RuntimeError(f'{f.__name__}.config already exists, refusing to override with decorator')
    f.config = builds(f, populate_full_signature=True)
    return f  

Maybe you would also need to wrap f.config in a read-only class so that future access (i.e. dynamically setting config on an initialized class instance) cannot delete or change the attribute populated by the decorator

addisonklinke commented 2 years ago

@rsokl @jgbos regarding hydrate(), have you seen the hydra-torch package by @romesco? I think you all are tackling a similar problem but with slightly different approaches, and I'm excited to see where this goes!

rsokl commented 2 years ago

What about a runtime check in the decorator - does that sufficiently handle the collision downside?

@addisonklinke That runtime check looks good to me. We could also name the attribute something like hydra_config which is far less-likely to collide with a pre-existing attribute of the same name. I also agree that making this a read-only attribute is worth considering.

For now, I don't envision adding builds_decorator to hydra-zen; given that it is such a thin wrapper around builds, interested users can pretty easily implement it on their end. If you are one of this interested users, I would be delighted to get your feedback on how this works for you. If this proves to be an especially fruitful decorator (e.g. it makes simpler the process of authoring library code that can also be easily configured in a Hydra app), then I will happily add it to hydra-zen.

regarding hydrate()

You might be interested in my crude prototype for hydra_zen.like. As described in the linked PR, this could potentially be encapsulate the proposed functionality of hydrate. The major blocker for both like and hydrate is getting the type-annotations right. Put plainly: I don't want to ship features that will create red squiggles in IDEs wherever they are used.

have you seen the hydra-torch package

We are familiar with hydra-torch, but thanks for pointing us to it! We had considered using it prior to our creating hydra-zen.

One issue with an approach like hydra-torch is that the notion of spinning up a similar codebase for other third-party libraries simply is too cumbersome and requires too much repetition. Additionally, for some torch objects -- especially optimizers -- we often want to leverage partial instantiation, which isn't immediately possible with hydra-torch's configs.

addisonklinke commented 2 years ago

I would be delighted to get your feedback on how this works for you. If this proves to be an especially fruitful decorator (e.g. it makes simpler the process of authoring library code that can also be easily configured in a Hydra app), then I will happily add it to hydra-zen

I have built out my own, and so far I really like how it's working out. I've been trying to think of an appropriate term for what it enables, and I think the closest is "object oriented packaging". I've written the decorator in such a way that my config groups mirror the hierarchy of the application's Python package. This makes it very intuitive to decide where certain class definitions should live and a new user could also understand the package quickly by reading the CLI --help

Portions of the design have turned out to be more complex than I thought, and it would be great to discuss some more concrete details with you or any of the other Hydra-Zen authors. I'm actually presenting my tech stack at a MLOps.Community meetup and will focus heavily on the configuration system I built to power an experimentation-friendly PyTorch Lightning application with Hydra-Zen. If you all are interested, please hop in their Slack workspace and shoot me a DM - I'll get you added to the Google Meet invite. Anyone else reading this issue thread is also welcome to join

You might be interested in my crude prototype for hydra_zen.like

Thanks for the pointer! I'll continue on that PR thread

One issue with an approach like hydra-torch is that the notion of spinning up a similar codebase for other third-party libraries simply is too cumbersome

I also find it a little too complex to require a new pip install hydra-* every time you want to use configs from a 3rd party library. Being able to call hydrate dynamically inside your application code would be much more flexible

rsokl commented 2 years ago

I have built out my own, and so far I really like how it's working out. I've been trying to think of an appropriate term for what it enables, and I think the closest is "object oriented packaging". I've written the decorator in such a way that my config groups mirror the hierarchy of the application's Python package. This makes it very intuitive to decide where certain class definitions should live and a new user could also understand the package quickly by reading the CLI --help

This sounds great! I'd love to see what this design looks like. Perhaps you could point me to the code, or we could arrange a video call sometime so that we can chat about it. Ultimately, I am quite interested in facilitating this sort of design for library authors who want to create "Hydra-aware" code.

addisonklinke commented 2 years ago

Great looking forward to discussing further at the meetup!