fumitoh / modelx

Use Python like a spreadsheet!
https://modelx.io
GNU Lesser General Public License v3.0

Numba/jit support #120

Closed alexeybaran closed 2 months ago

alexeybaran commented 2 months ago

It seems like modelx doesn't support Numba's jit:

1) I created a function using jit in a separate module.
2) I imported the module into a modelx model using the new_module function.
3) The function works, but it shows up as a regular function rather than as a CPUDispatcher object from Numba.

fumitoh commented 2 months ago

I've always wanted to test modelx with Numba but haven't gotten around to it due to time constraints. Do you have a sample script that can reproduce the issue?

By the way, recent versions of pandas support the engine="numba" option in DataFrame.apply, which might also work in modelx formulas. You can find more information in this article: "Unlocking C-level Performance in DataFrame.apply".

AB-Athene commented 2 months ago

The sample code below actually works. Sorry for the confusion.

import modelx as mx
from time import time

m, s = mx.new_model(), mx.new_space()
# Attach the plain and the jit-decorated modules to the model
m.new_module('fin', 'fin.py', 'fin.py')
m.new_module('fin_jit', 'fin_jit.py', 'fin_jit.py')

@mx.defcells
def a():
    import numpy as np
    for t in range(10000):
        fin.irr(np.ones(1000), 500)
    return 0

@mx.defcells
def a_jit():
    import numpy as np
    for t in range(10000):
        fin_jit.irr(np.ones(1000), 500)  # no early return, so the loop runs all 10000 calls like a()
    return 0

t0 = time()
a()
t1 = time()
a_jit()
t2 = time()
print('no jit:', t1 - t0, 'jit:', t2 - t1)

fin.zip

alexeybaran commented 2 months ago

The tradeoff seems to be the time it takes to compile the jit function on demand. That takes around 1 second for the function above.
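To make the on-demand compilation cost visible, here is a minimal sketch that times the first and second call of a small jit-compiled function. The function `total` and the helper `timed` are made up for illustration and are not part of fin.py or modelx. The fallback decorator is only there so the snippet still runs where Numba is not installed, in which case both calls take about the same time.

```python
from time import perf_counter

try:
    from numba import njit
except ImportError:
    # Assumption for illustration: without Numba, run the plain Python function.
    def njit(func):
        return func


@njit
def total(n):
    # Hypothetical workload: sum of i * 0.5 for i in range(n)
    s = 0.0
    for i in range(n):
        s += i * 0.5
    return s


def timed(n):
    # Return the result together with the wall-clock time of one call
    start = perf_counter()
    result = total(n)
    return result, perf_counter() - start


first_result, first_call = timed(1000)    # with Numba, this call includes compilation
second_result, second_call = timed(1000)  # this call reuses the compiled code
print(f"first call: {first_call:.4f}s, second call: {second_call:.6f}s")
```

With Numba installed, the first call should be much slower than the second, since compilation happens lazily on the first call for a given argument type.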

fumitoh commented 2 months ago

Thanks. Still, a_jit is considerably faster than a:

no jit: 4.631266117095947 jit: 0.7599091529846191

AB-Athene commented 2 months ago

The relationship reverses if the function is called 1000 times instead of 10000:

no jit: 0.672553539276123 jit: 1.3341047763824463

fumitoh commented 2 months ago

Yep. In other words, the jit version spends virtually all of its time compiling.

Size (calls)   No JIT (s)             JIT (s)
10000          4.631266117095947      0.7599091529846191
1000           0.4715766906738281     0.7797486782073975
100            0.046034812927246094   0.7822494506835938

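As a back-of-the-envelope reading of the table above, the break-even number of calls can be estimated from the posted timings alone. This sketch assumes that the JIT column is essentially a fixed compilation overhead and that the non-JIT cost grows linearly with the number of calls. Both are simplifications read off these three data points, not measured properties of fin.py.

```python
# Timings in seconds, copied from the table above: calls -> (no JIT, JIT)
timings = {
    10000: (4.631266117095947, 0.7599091529846191),
    1000: (0.4715766906738281, 0.7797486782073975),
    100: (0.046034812927246094, 0.7822494506835938),
}

# Assumption: the non-JIT cost is linear, so take the per-call cost from the largest run
no_jit_per_call = timings[10000][0] / 10000

# Assumption: the JIT timings are dominated by a fixed compilation overhead
jit_overhead = sum(jit for _, jit in timings.values()) / len(timings)

# Number of calls at which the plain version costs as much as compiling
break_even = jit_overhead / no_jit_per_call
print(f"per call without jit: {no_jit_per_call:.6f}s")
print(f"approximate jit overhead: {jit_overhead:.3f}s")
print(f"break-even at roughly {break_even:.0f} calls")
```

The estimate lands between 1000 and 10000 calls, which matches the posted results: the plain version wins at 1000 calls and the jit version wins at 10000.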
AB-Athene commented 2 months ago

It seems possible to avoid recompiling every time by caching. I'm not sure whether caching will work if I want to load several modelx models that may have different functions with the same name.

from numba import jit
import numpy as np

@jit(cache=True)
def df_simple(rate, shape):
    return (np.ones(shape) * (1.0 + rate) ** -1).cumprod()

@jit(cache=True)
def irr(cf, target, guess=0.01, tolerance=1e-10, max_iter=100):
    _shock = 0.0001
    n = cf.shape

fumitoh commented 2 months ago

A new module is created every time new_module is called. In the case above, a separate fin_jit module is created for each model, so if you create multiple models, the total compilation time increases with the number of models.
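The effect described above can be illustrated with a plain-Python analogy. This sketch does not use modelx and makes no claim about how new_module is implemented. It only shows that loading the same source file as two separate module objects yields two distinct function objects, so anything attached to a function object, such as in-memory compiled code, would not be shared between them. The file name `fin_stub.py`, the function `irr_stub`, and the helper `load_as_new_module` are hypothetical.

```python
import importlib.util
import tempfile
from pathlib import Path

# Hypothetical stand-in for a module such as fin_jit.py
SOURCE = "def irr_stub(x):\n    return x + 1\n"


def load_as_new_module(name, path):
    # Build a fresh module object from the file, bypassing sys.modules
    spec = importlib.util.spec_from_file_location(name, path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)
    return module


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "fin_stub.py"
    path.write_text(SOURCE)
    mod_a = load_as_new_module("fin_stub_a", path)  # analogous to the module in model A
    mod_b = load_as_new_module("fin_stub_b", path)  # analogous to the module in model B

# Same source, but two independent function objects
same_function = mod_a.irr_stub is mod_b.irr_stub
print(same_function)  # prints False
```

If the functions were jit-decorated, each copy would be its own dispatcher and would be compiled on its own first call, which is consistent with the compilation time growing with the number of models.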