GrahamDumpleton / wrapt

A Python module for decorators, wrappers and monkey patching.
BSD 2-Clause "Simplified" License
2.03k stars 231 forks source link

How to use adapter factory to change signature depending on instance #263

Open antonymilne opened 3 months ago

antonymilne commented 3 months ago

Hello and thank you for you this amazing library! I started using wrapt because I wanted access to the instance inside my decorator. I'd now like to take this a bit further to use instance in the adapter but not sure if it's possible.

I know that adapter factories can be used to alter a signature at runtime based on wrapped, but is it possible to somehow use instance also? Or am trying to do something that's just too weird here?

Possibly related: https://github.com/GrahamDumpleton/wrapt/issues/232.

Here's some code to illustrate what I'm trying to do. Thank you very much!

import inspect
import wrapt

def adapter(self, a, b):
    pass

def different_adapter(self, c, d):
    pass

# This works great.
# You can also input a factory here but it only has access to wrapped, not instance.
# Is it possible to make an adapter factory where the adapter depends on instance?
@wrapt.decorator(adapter=adapter)
def change_signature(wrapped, instance, args, kwargs):
    pass

# Something like this maybe? But it's not right:
@wrapt.decorator
def change_signature_custom(wrapped, instance, args, kwargs):
    @wrapt.decorator(adapter=instance.adapter)
    def change_signature(wrapped, instance, args, kwargs):
        pass
    return change_signature(wrapped)

class Class:
    def __init__(self, adapter=None):
        self.adapter = adapter

    @change_signature
    def f(self):
        pass

    @change_signature_custom
    def g(self):
        pass

print(f"{inspect.getfullargspec(Class().f)=}") 
print(f"{inspect.signature(Class().f)=}\n")
# Works as expected and correctly gives:
# inspect.getfullargspec(Class().f)=FullArgSpec(args=['self', 'a', 'b'], varargs=None, varkw=None, defaults=None, kwonlyargs=[], kwonlydefaults=None, annotations={})
# inspect.signature(Class().f)=<Signature (a, b)>

print(f"{inspect.getfullargspec(Class(adapter=different_adapter).g)=}")
print(f"{inspect.signature(Class(adapter=different_adapter).g)=}\n")
# I'd like this to use the different_adapter but not sure how to get that working? Currently gives
# inspect.getfullargspec(Class(adapter=different_adapter).g)=FullArgSpec(args=['self'], varargs=None, varkw=None, defaults=None, kwonlyargs=[], kwonlydefaults=None, annotations={})
# inspect.signature(Class(adapter=different_adapter).g)=<Signature ()>
GrahamDumpleton commented 3 months ago

Why do you want to be able to do this?

Overall it doesn't even make sense to me why it would be needed or whether it make sense. This is because signatures in Python are normally based on the unbound method prototype attached to the class and not something that dynamically adapts itself per instance of the class.

antonymilne commented 3 months ago

Thanks for the response @GrahamDumpleton. Let me explain a bit more since I know it does sound an odd thing to do.

I have a Data class that handles data loading in a web app. This can wrap functions that can have arbitrarily different signatures:

class Data:
    def __init__(self, function):
        self.function = function

    # Must have generic *args, **kwargs to be able to handle anything.
    def load(self, *args, **kwargs):
        return self.function(*args, **kwargs)

# One data loading function
def f(something, blah=10):
    pass

# Another data loading function
def g(something_completely_different, another_argument):
    pass

f_data = Data(f)
g_data = Data(g)

At runtime these data loading functions are run with the arguments provided by the user (e.g. through a URL query parameter):

f_data.load(something=1)
g_data.load(something_completely_different="hello", another_argument=2)

So far everything is ok. Next I wanted to incorporate flask-caching in order to memoize the loading calls. Flask-caching enables you to set a timeout, which is an attribute of Data. This is when I turned to using wrapt:

@wrapt.decorator
def memoize(wrapped, instance, args, kwargs):
    return flask_cache(timeout=instance.timeout)(wrapped)(*args, **kwargs)

class DataWithCache:
    def __init__(self, function, timeout):
        self.function = function
        self.timeout = timeout

    @memoize
    def load(self, *args, **kwargs):
        return self.function(*args, **kwargs)

# Set different timeouts for different data
f_data = Data(f, timeout=60)
g_data = Data(g, timeout=120)

The above works great in most contexts, but the following will not generate the same cache key:

f_data.load(something=1)
f_data.load(1)

There is functionality built into flask-caching to make this work correctly, but it relies on inspection of the function being memoized. Unfortunately, this function always appear as load(self, *args, **kwargs) and so the memoization doesn't work correctly. In order to fix it I would like to make the function that's passed through to flask_cache to have the same signature as the original Data.function, like this:

@wrapt.decorator
def memoize(wrapped, instance, args, kwargs):
    # Pseudocode:
    wrapped.signature = instance.function.signature
    return flask_cache(timeout=instance.timeout)(wrapped)(*args, **kwargs)

The problem is I can't find any way to achieve this. wrapt.decorator is (correctly) mirroring the signature load(self, *args, **kwargs), but I'd like some way to change that signature to be the same as the underlying Data.function signature.


Possibly the easiest solution is to just memoize slightly differently, like this:

def memoize(timeout):
    @wrapt.decorator
    def wrapper(wrapped, instance, args, kwargs):
        return flask_cache(timeout=timeout)(wrapped)(*args, **kwargs)
    return wrapper

class Data:
    ...

    # Don't memoize load function itself but instead the contents of the function call.
    def load(self, *args, **kwargs):
        return memoize(self.timeout)(self.function)(*args, **kwargs)

In this case the signature of self.function is maintained and all works well. This might be ok as a solution, but I'd be interested if it is possible to get it working m first way where I memoize the whole load. Of course if you think the structure of my code is just completely wrong here then I'd love to hear better alternatives. Thank you!

GrahamDumpleton commented 3 months ago

I'll have to think about it, but first quick glance through (I could have misunderstood things), you might want to use the special __new__() method of a class to dynamically add the load function on the class specific to each instance based on the argument to __init__(). That way you can initialize the adapter unique to each instance.

If I get a chance later I will try and get together and example of what am thinking of.

GrahamDumpleton commented 3 months ago

BTW, why not just do something like:

class DataWithCache:
    def __init__(self, function, timeout):
        self.function = function
        self.timeout = timeout
        self.load = memoize(self.timeout)(self.function)
antonymilne commented 3 months ago

That would be possible (it's one answer suggested on this question on stackoverflow) but would not work with the following, which is actually how timeout is often set, rather than using __init__ (note that timeout is actually an optional argument in reality):

data = DataWithCache(function)
data.timeout = 10

I could also implement the custom behaviour to ensure that when timeout is set, load is also updated, but it all starts getting a bit messy and non-obvious I think compared to look for a dedicated def load method (especially since there's other classes like DataWithoutCache that have a def load).

GrahamDumpleton commented 3 months ago

If timeout was defined as a property with a setter you could update self.load to reflect late binding of timeout.

antonymilne commented 3 months ago

Indeed that is possible, but it starts getting a bit awkward since in reality there will be more than one argument like timeout:

class DataWithCache:
    def __init__(self, function, timeout, another_argument):
        self.function = function
        self._timeout = timeout
        self._another_argument = another_argument

    @property
    def timeout(self):
        return self._timeout

    @timeout.setter
    def timeout(self, value):
        self._timeout = value
        self.load = memoize(self.timeout)(self.function)

    # Need to repeat this logic for _another_argument.

No doubt this could be made less repetitive using some sort of __setattr__ and the like, but it all starts getting a bit tricky. Especially since, as above, there's other classes like DataWithoutCache that have a def load, so I'd prefer to keep an explicit def load in there if possible.

I don't mean to dismiss all your suggestions here - they're definitely very valid ideas! At the moment I am doing my workaround of memoize(self.timeout)(self.function)(*args, **kwargs) inside the def load and it works ok, I am just wondering whether my original solution would have been possible somehow.

GrahamDumpleton commented 3 months ago

But doesn't creating the memoize wrapper on each load call resulting in all cached data being discarded each time? Doesn't the memoize instance need to persist across calls with the same one being called through each time in order to get what you want?

antonymilne commented 3 months ago

No, it seems to work ok, although now you mention it I am not entirely sure why... Indeed it doesn't feel like the right thing to do for that reason. I suppose that flask-caching memoize doesn't care about the actual function being cached; it just uses the function name and arguments to generate a cache key which will match up across repeated calls even if it's a different memoize instance.