guidance-ai / guidance

A guidance language for controlling large language models.
MIT License
19.12k stars 1.04k forks source link

[Feature] Allow `@guidance` to decorate methods #1035

Closed hudson-ai closed 1 month ago

hudson-ai commented 1 month ago

This PR extends the @guidance decorator to work with methods.

I believe this will be invaluable when defining grammars that are implemented as several guidance functions that call each other, particularly if they need to share common parameters. I think that rewriting json as a class will be a good test case for this, and I expect to be able to simplify the implementation with this new machinery as well as enable some otherwise-tricky shared parameters like separators=(", ", ": ").

To achieve this:

Side note/todo: _guidance.pyi is now not strictly correct, but it still gives helpful type hints. It's most-wrong in the case where a user guidance-decorates a method but tries to call it from the class rather than an instance. This case should be broken anyway, but I need to think some more about how to give the user a good error message. IMO, that can wait for a follow-up PR.

codecov-commenter commented 1 month ago

:warning: Please install the 'codecov app svg image' to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 96.90722% with 3 lines in your changes missing coverage. Please review.

Project coverage is 63.45%. Comparing base (63bfd9f) to head (612a810).

Files with missing lines Patch % Lines
guidance/_guidance.py 97.33% 2 Missing :warning:
guidance/_utils.py 95.45% 1 Missing :warning:

:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.

:exclamation: There is a different number of reports uploaded between BASE (63bfd9f) and HEAD (612a810). Click for more details.

HEAD has 69 uploads less than BASE | Flag | BASE (63bfd9f) | HEAD (612a810) | |------|------|------| ||137|68|
Additional details and impacted files ```diff @@ Coverage Diff @@ ## main #1035 +/- ## ========================================== - Coverage 71.39% 63.45% -7.94% ========================================== Files 63 63 Lines 4670 4731 +61 ========================================== - Hits 3334 3002 -332 - Misses 1336 1729 +393 ```

:umbrella: View full report in Codecov by Sentry.
:loudspeaker: Have feedback on the report? Share it here.

hudson-ai commented 1 month ago

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

paulbkoch commented 1 month ago

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

Maybe use weakref? The cached objects can then live and be used for a while, but will be cleared during a garbage collection if there is too much memory pressure.

hudson-ai commented 1 month ago

I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...

Maybe use weakref? The cached objects can then live and be used for a while, but will be cleared during a garbage collection if there is too much memory pressure.

My thoughts exactly. Woke up this morning in a cold sweat mumbling to myself "...the garbage collector!" (joking).

Notice that there are essentially two levels of caches here -- one on the decorated function itself which is needed for "grammar node reuse" that wraps the unbound function (note it will receive the self argument and use it as part of the cache key), and one that sits on the unbound GuidanceFunction's __get__, which is needed to ensure that myinstance.func returns the same GuidanceMethod each time (so that we can detect/handle recursive calls).

My thought is to essentially use a weakref for the latter cache and then move the other cache to the bound method so that it doesn't need to hold a reference to the instance at all. Make general sense?

hudson-ai commented 1 month ago

Side note, but I'm getting uncomfortable with the fact that we're not setting maxsize anywhere on our LRU caches. Just asking for memory leaks...

hudson-ai commented 1 month ago

Non-side note: @paulbkoch how do you feel about the spirit of this PR in the first place? Do you think that allowing methods to be decorated is a good thing? 😄

hudson-ai commented 1 month ago

@paulbkoch I finally feel good about the caching. Feel free to take a look on your own time or I'd be happy to jump on a call and do some code review 😄

Harsha-Nori commented 1 month ago

Had a think on this -- I'm a fan of the concept. Python decorators are allowed to be used everywhere, so it seems quite reasonable to me to make the guidance decorator work on class methods too. I think there is organizational/bookkeeping utility from being able to do this in our own codebase, even if end users don't take advantage of it as much.

hudson-ai commented 1 month ago

@Harsha-Nori glad you're a fan of the idea. Before pushing this through, I want to have a deeper conversation about what is or isn't cached, especially for the sake of recursive calls. Some subtle things going on there, even for vanilla guidance functions (non-methods). Definitely don't want to be adding unnecessary cognitive load for users.

hudson-ai commented 1 month ago

@Harsha-Nori @paulbkoch any objections to merging this?

Biggest "gotcha" I can really think of -- the self argument will be ignored in the determination of what does/does not constitute a "zero arg recursive call". I.e. modifying the state of self may have unexpected consequences to the naive user.

This PR does provide a proper way of handling this though: if the object implements a __hash__ method that reflects changing state on the instance, the above gotcha can be avoided.

hudson-ai commented 1 month ago

Merging on verbal review from @Harsha-Nori :)