Open 1fish2 opened 6 years ago
Another candidate tool:
threading.Thread
s.pypy might actually support NumPy now. The NumPy 1.15 release notes say "Improved support for PyPy." Ditto for the 1.14 release notes.
If it's beyond the "work in progress" stage, then it might be worth trying. Also the Intel Distribution for Python, #36
Good news: Numba is very effective when used in Gillespie-like stochastic simulations:
from __future__ import absolute_import, division, print_function
import timeit
import numpy as np
from numba import jit
def gillespie():
t = 0
x1 = 1000
x2 = 0
# 1, 2: equilibrium
c1 = 10
c2 = 10
# 3: loss
c3 = 0.1
t_hist = [0]
x_hist = [(x1, x2)]
for iteration in xrange(100000): # both range and xrange work
a1 = c1 * x1
a2 = c2 * x2
a3 = c3 * x2
a = np.array([a1, a2, a3])
at = a.sum()
if at == 0:
break
dt = np.random.exponential(1/at)
t += dt
# choice = np.where(np.random.multinomial(1, a / at))[0][0]
# choice = np.where(np.cumsum(a) >= at * np.random.uniform(0, 1))[0][0]
# This gives the best performance - about 50% better than the other
# two, which is consistent with theoretical expectations
r = at * np.random.uniform(0, 1)
for choice, acs in enumerate(np.cumsum(a)):
if acs >= r:
break
else:
raise Exception("A reaction couldn't be selected - this should not happen.")
if choice == 0:
x1 -= 1
x2 += 1
elif choice == 1:
x1 += 1
x2 -= 1
elif choice == 2:
x2 -= 1
else:
raise Exception("A nonexistant reaction was selected - this should not happen.")
t_hist.append(t)
x_hist.append((x1, x2))
return np.array(t_hist), np.array(x_hist)
jit_gillespie = jit(
gillespie,
nopython = True, # little to no difference, but good for error checking?
# parallel = True, # no improvement, possibly worse
# fastmath = True # little to no improvement
)
NUMBER = 1
print(timeit.repeat(gillespie, number = NUMBER))
print(timeit.repeat(jit_gillespie, number = NUMBER))
'''
Output:
[0.9029200077056885, 0.8931000232696533, 0.8873131275177002]
[0.4484701156616211, 0.02795100212097168, 0.02762603759765625]
'''
That's roughly a 30x speed-up for subsequent calls, and even the initial call is ~2x as fast as the uncompiled call. Unsure how this compares with Cython.
Caveat: I'm told that you can call numpy.random.seed
within the JIT'd code to seed the random number generation, but seeding outside doesn't work, and neither does using a numpy.random.RandomState
object.
Note: The numba.jit
function can be used like a decorator.
Little update on seeding:
def gillespie(seed = None):
if seed is not None:
np.random.seed(seed)
else:
# Without this line, Numba gets angry and fails. I don't exactly know
# why, other than that it's having a hard time inferring types, which
# providing an explicit signature *should* help (but does not, in fact,
# appear to make a difference).
seed = 0
...
I think this is a Numba bug, or at least, an unhandled use case (although it seems like an obvious use case to me). I verified that you can't seed outside the function, nor can you pass a numpy.random.RandomState
instance.
Further updates:
The code I posted above isn't necessary if you tell Numba about the types of locals:
def gillespie(seed = None):
if seed is not None:
np.random.seed(seed)
...
jit_gillespie = nb.jit(
gillespie,
locals = dict(
# This conveys that 'seed' is either None or an
# integer (specifically, an unsigned 32-bit integer)
seed = nb.optional(nb.uint32)
),
nopython = True
)
Also, I tried pre-generating the random numbers outside of the loop (as we now do in the complexation code), and found no perceivable improvement in evaluation time.
Here are the most interesting of the faster-Python implementations. Caveat emptor. Obviously our requirements include a robust implementation with NumPy and Linux compatibility.
Numba JIT-compiles math-heavy Python code to optimized CPU or GPU machine code at import time, runtime, or statically. Is it useful when most of the math is done in NumPy?
NumExpr is a fast numeric array-wise expression evaluator for Python, NumPy, PyTables, pandas, bcolz, and more. Compared to NumPy, it can get better cache utilization, reduces memory access, and can use all your CPU cores. NumExpr performs best on matrices that are too large to fit in L1 CPU cache.
Theano is a Python library to efficiently evaluate mathematical expressions involving multi-dimensional arrays. We're currently using Theano 0.7.0. The current release is 0.9.0. Release 1.0 will be the last main release.
Nuitka Python compiler.
Pyjion JIT API for CPython. (The README gives a quick comparison to various other tools like PyPy, Pyston, Numba, and Nuitka.)
Also see: