Closed · hudson-ai closed this 1 month ago
:warning: Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.
Attention: Patch coverage is 96.90722% with 3 lines in your changes missing coverage. Please review.
Project coverage is 63.45%. Comparing base (63bfd9f) to head (612a810).
| Files with missing lines | Patch % | Lines |
|---|---|---|
| guidance/_guidance.py | 97.33% | 2 Missing :warning: |
| guidance/_utils.py | 95.45% | 1 Missing :warning: |
:exclamation: Your organization needs to install the Codecov GitHub app to enable full functionality.
:exclamation: There is a different number of reports uploaded between BASE (63bfd9f) and HEAD (612a810). Click for more details.
HEAD has 69 fewer uploads than BASE.
| Flag | BASE (63bfd9f) | HEAD (612a810) |
|------|------|------|
| | 137 | 68 |
:umbrella: View full report in Codecov by Sentry.
I'm unsatisfied with the behavior of caches here -- they will prevent instances from being garbage collected, and cache entries themselves won't get cleared. Need to tinker with that a bit more...
Maybe use weakref? The cached objects can then live and be used for a while, but will be cleared during a garbage collection if there is too much memory pressure.
My thoughts exactly. Woke up this morning in a cold sweat mumbling to myself "...the garbage collector!" (joking).
Notice that there are essentially two levels of caches here -- one on the decorated function itself, which wraps the unbound function and is needed for "grammar node reuse" (note that it will receive the `self` argument and use it as part of the cache key), and one that sits on the unbound `GuidanceFunction`'s `__get__`, which is needed to ensure that `myinstance.func` returns the same `GuidanceMethod` each time (so that we can detect/handle recursive calls).

My thought is to essentially use a weakref for the latter cache and then move the other cache to the bound method so that it doesn't need to hold a reference to the instance at all. Make general sense?
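Roughly, something like the sketch below (illustrative only, not the actual code in this PR; it assumes instances are hashable and weak-referenceable, and the class names are just borrowed from the PR):

```python
import weakref
from functools import lru_cache


class GuidanceFunction:
    """Sketch: wraps an unbound @guidance-decorated function."""

    def __init__(self, func):
        self.func = func
        # Weakly keyed by instance: entries vanish when the instance is
        # garbage collected, so this cache never keeps instances alive.
        self._bound = weakref.WeakKeyDictionary()

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Return the same bound object for a given instance every time,
        # so recursive calls can be detected and handled.
        try:
            return self._bound[instance]
        except KeyError:
            method = GuidanceMethod(self.func, instance)
            self._bound[instance] = method
            return method


class GuidanceMethod:
    """Sketch: a guidance function bound to a particular instance."""

    def __init__(self, func, instance):
        self._func = func
        # Hold only a weak reference; a strong one would keep the
        # WeakKeyDictionary entry above alive forever.
        self._instance_ref = weakref.ref(instance)
        # The "grammar node reuse" cache lives on the bound method, so
        # the instance never appears in its cache keys.
        self._cached = lru_cache(maxsize=None)(self._call_uncached)

    def _call_uncached(self, *args, **kwargs):
        instance = self._instance_ref()
        if instance is None:
            raise ReferenceError("instance has been garbage collected")
        return self._func(instance, *args, **kwargs)

    def __call__(self, *args, **kwargs):
        return self._cached(*args, **kwargs)
```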
Side note, but I'm getting uncomfortable with the fact that we're not setting `maxsize` anywhere on our LRU caches. Just asking for memory leaks...
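For concreteness, here's what I mean (hedged sketch; whether our caches use `functools.lru_cache` directly or something custom, the point is just to cap growth):

```python
from functools import lru_cache

# Unbounded: entries accumulate for the life of the process.
@lru_cache(maxsize=None)
def build_node_unbounded(key: str) -> str:
    return f"node({key})"  # stand-in for real grammar construction

# Bounded: least-recently-used entries are evicted beyond 256.
@lru_cache(maxsize=256)
def build_node_bounded(key: str) -> str:
    return f"node({key})"
```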
Non-side note: @paulbkoch how do you feel about the spirit of this PR in the first place? Do you think that allowing methods to be decorated is a good thing? 😄
@paulbkoch I finally feel good about the caching. Feel free to take a look on your own time or I'd be happy to jump on a call and do some code review 😄
Had a think on this -- I'm a fan of the concept. Python decorators are allowed to be used everywhere, so it seems quite reasonable to me to make the guidance decorator work on class methods too. I think there is organizational/bookkeeping utility from being able to do this in our own codebase, even if end users don't take advantage of it as much.
@Harsha-Nori glad you're a fan of the idea. Before pushing this through, I want to have a deeper conversation about what is or isn't cached, especially for the sake of recursive calls. Some subtle things going on there, even for vanilla guidance functions (non-methods). Definitely don't want to be adding unnecessary cognitive load for users.
@Harsha-Nori @paulbkoch any objections to merging this?
Biggest "gotcha" I can really think of -- the self
argument will be ignored in the determination of what does/does not constitute a "zero arg recursive call". I.e. modifying the state of self
may have unexpected consequences to the naive user.
This PR does provide a proper way of handling this though: if the object implements a __hash__
method that reflects changing state on the instance, the above gotcha can be avoided.
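For example (a hypothetical class, not code from this PR): if the hash reflects the mutable state, then mutating the instance produces a different cache key, so stale cached grammar nodes aren't reused.

```python
import guidance
from guidance import select


class Chooser:
    def __init__(self, options):
        self.options = tuple(options)

    # The hash reflects the mutable state, so changing `options` yields a
    # new cache key and previously cached grammar nodes aren't reused.
    def __hash__(self):
        return hash(self.options)

    def __eq__(self, other):
        return isinstance(other, Chooser) and self.options == other.options

    @guidance(stateless=True)
    def pick(self, lm):
        return lm + select(self.options)
```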
Merging on verbal review from @Harsha-Nori :)
This PR extends the `@guidance` decorator to work with methods. I believe this will be invaluable when defining grammars that are implemented as several guidance functions that call each other, particularly if they need to share common parameters. I think that rewriting `json` as a class will be a good test case for this, and I expect to be able to simplify the implementation with this new machinery as well as enable some otherwise-tricky shared parameters like `separators=(", ", ": ")`.

To achieve this:
- A `GuidanceFunction` type is returned when the decorator is called on a function.
- `GuidanceFunction` uses `__get__` descriptor magic to ensure that the function is bound to an instance before being wrapped by the decorator's actual implementation (`_decorator`).
- Bound `GuidanceFunction`s are given a `GuidanceMethod` type.

Side note/todo: `_guidance.pyi` is now not strictly correct, but it still gives helpful type hints. It's most wrong in the case where a user guidance-decorates a method but tries to call it from the class rather than an instance. This case should be broken anyway, but I need to think some more about how to give the user a good error message. IMO, that can wait for a follow-up PR.
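For illustration, a hypothetical sketch of the kind of usage this enables (not code from this PR; `ListGrammar` and its methods are made up, and it assumes guidance's `select` and `zero_or_more` helpers):

```python
import guidance
from guidance import select, zero_or_more


class ListGrammar:
    def __init__(self, separator=", "):
        self.separator = separator  # shared by all the methods below

    @guidance(stateless=True)
    def item(self, lm):
        return lm + select(["1", "2", "3"])

    @guidance(stateless=True)
    def separated_item(self, lm):
        return lm + self.separator + self.item()

    @guidance(stateless=True)
    def items(self, lm):
        # Methods can call each other and share `self.separator` without
        # threading it through every signature.
        return lm + self.item() + zero_or_more(self.separated_item())
```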