amakelov / mandala

A simple & elegant experiment tracking framework that integrates persistence logic & best practices directly into Python
Apache License 2.0

Expected a function or method, but got `<class 'method_descriptor'>` #14

Open iamrecursion opened 3 months ago

iamrecursion commented 3 months ago

This is a super cool idea that I can see being amazingly useful! I've been testing it on one of my research projects, and it seems to be failing with the error above.

The project is quite big, so if you can point me to where to look for a cause, I'll do my best to provide some more info!

amakelov commented 3 months ago

Hi! Can you provide an example causing this behavior?

iamrecursion commented 3 months ago

I've managed to create a small one! You need beartype and Mandala as dependencies. I was using beartype 0.18.2 when I created this minimal reproducer, but the real project was on 0.17.2, so I don't think the version matters too much!

from beartype import beartype
from mandala.imports import Storage, op

storage = Storage(deps_path='__main__')

@beartype
class BugExample:

    my_var: int

    @op
    def __init__(self, init_value: int) -> None:
        self.my_var = init_value

    @op
    def add_to(self, value: int) -> int:
        old = self.my_var
        self.my_var += value
        return old

with storage.run():
    my_thing = BugExample(10)
    my_thing.add_to(10)

I've presented this as Python here, but the reproducer is actually a cell in an IPython notebook.

amakelov commented 3 months ago

I see - unfortunately, this is (currently) the expected behavior, because class methods are not supported by @op.

This is because mandala is (currently) built around a "functional" style - caching function calls and building computations out of function composition - as opposed to an "object-oriented" style where you could also cache method calls. There are many reasons for this; mostly, mandala is geared towards data science / ML applications, where it's more typical to chain data transformations and analyses than to update some mutable state (the big exception being model training).

As a consequence, there are parts of the code that assume that @op-decorated callables behave like pure functions, which is not the case for the idiomatic use of methods in OOP (if none of your methods update the object state, why even use methods at all?). This may or may not be fixed in a future version coming soon :)
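To illustrate the intended style, here's a rough sketch of how a pipeline is typically composed out of @op-decorated pure functions (reusing the `storage` object from above; the function names and bodies are just illustrative):

@op
def load_data(n: int) -> list:
    return list(range(n))

@op
def double_all(data: list) -> list:
    return [x * 2 for x in data]

@op
def total(data: list) -> int:
    return sum(data)

with storage.run():
    data = load_data(10)
    doubled = double_all(data)
    result = total(doubled)

Each call is memoized against its inputs, so re-running the block reuses the saved results instead of recomputing them.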

In the meantime, anything expressible via OOP can be (awkwardly) expressed using pure functions, e.g.

from typing import Tuple

@op # convert `__init__` into a pure function returning the *state* of the instance (not the mutable object itself)
def init_BugExample(init_value: int) -> int:
    return init_value

@op # convert methods into pure functions over the instance state, returning the method's result along with the new state
def add_to(bug_example_data: int, value: int) -> Tuple[int, int]:
    # in more complex scenarios, you may have to do some copying here to keep the function pure, etc...
    return bug_example_data, bug_example_data + value

with storage.run():
    my_thing = init_BugExample(10)
    _, my_thing = add_to(my_thing, 10)

I'd love to hear about use cases involving caching calls to object methods.

iamrecursion commented 3 months ago

Ah, makes sense! It might be worth documenting this more clearly and raising a better error message while it's explicitly not supported!

In terms of my use case, the project I've been testing this on involves cryptographic hash functions. In particular, these hash functions usually have a state, which I model as a class with accompanying operations on that state.

Due to the way I'm running analysis—where it is very useful to be able to inspect individual state transformations—it could be very beneficial to cache individual transformations. I would imagine this working by storing copies of self using deepcopy or similar, and forwarding those as needed.
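Roughly, I'm imagining something like the sketch below (reusing `storage` and `@op` from the reproducer; `HashState` and `absorb_block` are made-up stand-ins for my real classes, and I haven't checked how well mandala persists custom objects like this):

from copy import deepcopy
from dataclasses import dataclass, field

@dataclass
class HashState:
    words: list = field(default_factory=lambda: [0] * 8)

@op
def init_state() -> HashState:
    # pure "constructor": returns a fresh state value
    return HashState()

@op
def absorb_block(state: HashState, block: int) -> HashState:
    # deepcopy so the cached input state is never mutated in place
    new_state = deepcopy(state)
    new_state.words[0] ^= block
    return new_state

with storage.run():
    state = init_state()
    state = absorb_block(state, 0xDEADBEEF)
    state = absorb_block(state, 0xCAFEBABE)

Each intermediate state would then be cached and inspectable after the fact, which is the kind of per-transformation inspection I'm after.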