exaloop / codon

A high-performance, zero-overhead, extensible Python compiler using LLVM
https://docs.exaloop.io/codon
Other
14.32k stars 502 forks source link

Caching / Memoization of Functions #337

Open lewisfogden opened 1 year ago

lewisfogden commented 1 year ago

I have tried writing a cache decorator (albeit with needing to define the cache data holder separately). However when running this, only external calls are accessing the decorated function - maybe because the function is being compiled before decorated?

Minimal example below using something along the lines of fibonacci.

Bit stuck - is there a way to do memoization? (ideally without creating a separate store - I've tried using a class but can't see how to make it generic for the function parameter args and return type)

# memoization / caching with codon (functools.cache etc are not developed yet)

cache_store = Dict[int, int]()   # need to manually configure cache for now.

def cache(func: function):
    print(f"Caching {func}")
    def inner(val: int) -> int:
        if val in cache_store:
            print(f"Getting {val} from cache")
            return cache_store[val]
        else:
            print(f"Calculating {val}")
            result: int = func(val)
            cache_store[val] = result
            return result
    return inner

@cache
def fibo(n: int) -> int:
    print(f"fibo calculating {n}")
    if n < 2:
        return n
    else:
        return fibo(n - 1) + fibo(n - 2)    ## << not accessing decorated function

print("*"*20)

f4 = fibo(4)
print(f"{f4=} : {cache_store=}")

print("*"*20)

f6 = fibo(6)
print(f"{f6=} : {cache_store=}")

print("*"*20)

f6 = fibo(6)
print(f"{f6=} : {cache_store=}")

Result:

********************
Calculating 4
fibo calculating 4
fibo calculating 3
fibo calculating 2
fibo calculating 1
fibo calculating 0
fibo calculating 1
fibo calculating 2
fibo calculating 1
fibo calculating 0
f4=3 : cache_store={4: 3}
********************
Calculating 6
fibo calculating 6
fibo calculating 5
fibo calculating 4
fibo calculating 3
fibo calculating 2
fibo calculating 1
fibo calculating 0
fibo calculating 1
fibo calculating 2
fibo calculating 1
fibo calculating 0
fibo calculating 3
fibo calculating 2
fibo calculating 1
fibo calculating 0
fibo calculating 1
fibo calculating 4
fibo calculating 3
fibo calculating 2
fibo calculating 1
fibo calculating 0
fibo calculating 1
fibo calculating 2
fibo calculating 1
fibo calculating 0
f6=8 : cache_store={4: 3, 6: 8}
********************
Getting 6 from cache
f6=8 : cache_store={4: 3, 6: 8}
elisbyberi commented 1 year ago

@lewisfogden I think that this topic has already been discussed in #288.

lewisfogden commented 1 year ago

@lewisfogden I think that this topic has already been discussed in #288.

Hi, it is similar but different. Tail call optimisation is only useful in a few limited cases. Function result caching is much more general and avoids duplicated calculations. (See functions.cache)

Unfortunately a lot of functools in python depends on run-time introspection & ducktyping so probably not an easy library to replicate.

elisbyberi commented 1 year ago

Hi @lewisfogden,

The Python decorator syntax is essentially a shortcut for applying functions or classes as decorators to other functions or classes. Here's how it works:

cache_store = None

def inner(f, val, T: type):
    global cache_store
    if not cache_store:
        cache_store = Dict[type(val), T]()

    if val in cache_store:
        print(f"Getting {val} from cache")
        return cache_store[val]
    else:
        print(f"Calculating {val}")
        result = f(val)
        cache_store[val] = result
        return result

def _fibo(n):
    print(f"fibo calculating {n}")
    if n < 2:
        return n
    else:
        return inner(_fibo, n - 1, int) + inner(_fibo, n - 2, int)

def fibo(n):
    return inner(_fibo, n, int)

print("*" * 20)

f4 = fibo(4)
print(f"{f4=} : {cache_store=}")

print("*" * 20)

f6 = fibo(6)
print(f"{f6=} : {cache_store=}")

print("*" * 20)

f6 = fibo(6)
print(f"{f6=} : {cache_store=}")
lewisfogden commented 1 year ago

@elisbyberi thanks for the suggestion 👍

I understand decorators, e.g.

@decor
def foo():
    pass

is the same as:

def foo():
    pass
foo = decor(foo)

There are a few options, but none are as useful as a proper cache function (for reference, here is a Julia implementation, which relies on macros: https://github.com/JuliaCollections/Memoize.jl )

Could force it (messily) to decorate the inner, which works

def fibo(n: int) -> int:
    fibo = cache(fibo)   # eeek! called each time fibo is called, adds overhead.
    print(f"fibo calculating {n}")
    if n < 2:
        return n
    else:
        return fibo(n - 1) + fibo(n - 2)
fibo = cache(fibo)   # and finally apply to outer

Could just avoid decorators all together (they seem too dynamic for Codon), adds boilerplate

cache = Dict[int, int]()

def fib(n):
    if n in cache:
        return cache[n]
    else:
        value = n if n < 2 else fib(n - 1) + fib(n - 2)
        cache[n] = value
        return value
elisbyberi commented 1 year ago

@lewisfogden You don't need to repeatedly create the decorator function using the higher-order "cache" function. Instead, you can simply use the decorator function directly. Note that the "cache" function is not the actual decorator function. The decorator function is the one that you are returning from the "cache" function, such as the "inner" function.

Let's look at an example from PEP-0318:

def attrs(**kwds):
    def decorate(f):
        for k in kwds:
            setattr(f, k, kwds[k])
        return f

    return decorate

@attrs(versionadded="2.2", author="Guido van Rossum")
def mymethod():
    pass

print(mymethod.versionadded)
print(mymethod.author)

This is essentially equivalent and has the same performance overhead as:

def decorate(f):
    kwds = {"versionadded": "2.2", "author": "Guido van Rossum"}
    for k in kwds:
        setattr(f, k, kwds[k])
    return f

def mymethod():
    pass

mymethod = decorate(mymethod)

print(mymethod.versionadded)
print(mymethod.author)

Just to clarify, the "attrs" function is used to create the decorator for the "mymethod" function. The actual decorator function is the "decorate" function.

lewisfogden commented 1 year ago

Thanks, that is interesting, however I think we are going a bit off topic! The issue was caching/memoization 😃

elisbyberi commented 1 year ago

@lewisfogden Perhaps we have deviated from the main topic by studying Python decorators, but this is because Python does not provide built-in support for function caching or memoization. However, it is possible to implement caching functionality without using Python decorator syntax, just like the "functools" module's "cache" function creates a decorator to cache other functions.

In Codon, we are required to create decorators using static typing. The main objective is to create the decorator function, regardless of the programming language syntax used, as it will ultimately get compiled. This means that all of these syntaxes will be removed during the optimization process.

inumanag commented 1 year ago

I need to take a look at this one. We do not support Python's native stdlib caching (functools.cache) yet.