Closed TheBB closed 6 years ago
Now that's what I'm talking about! 🎉 🎈 😀
Great work!
I was trying to run this using python2, but ran into `fatal error: numpy/arrayobject.h not found`. There is a thread on the subject. This is probably related to the fact that I have both numpy for python3 and python2 installed (at different versions), and I fear that the solution proposed in that thread might mess up my python3 setup.
When running with py3, I observe the same 100x speedup that you report.
Need to add `include_dirs=[numpy.get_include()]` to setup.py, as outlined in this StackOverflow thread; then it works for both py2 and py3. Can you add this? Then I'll merge it.
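For reference, a minimal sketch of what that change could look like. The module name `fastmod` and source file `fastmod.c` are hypothetical placeholders, not the names used in this PR; only `include_dirs=[numpy.get_include()]` comes from the comment above.

```python
import numpy
from setuptools import Extension


def make_extension(name="fastmod", source="fastmod.c"):
    """Build an Extension that can find numpy/arrayobject.h.

    `name` and `source` are hypothetical placeholders for this sketch.
    """
    return Extension(
        name,
        sources=[source],
        # Ask the numpy that belongs to the running interpreter for its
        # header directory, so py2 and py3 each pick up their own headers.
        include_dirs=[numpy.get_include()],
    )


ext = make_extension()
```

In a setup.py the resulting object would be passed as `setup(ext_modules=[ext])`. Because the include path is resolved from whichever interpreter runs setup.py, this should not touch the other Python's numpy installation.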
Damn Python 2… okay, updated.
First, I gave types to all input arguments. This probably doesn't matter much for speed since they must be checked anyway. I used `np.float_t` for all doubles. This type is usually the same as `double` anyway, but it might help on platforms where that's not the case, if any. After everything was typed as a memoryview, I got an 8x speedup (4x before this PR). Implementing bisection in C brought that up to 15x, and doing the same with `snap()` landed us on a not inconsiderable 100x performance improvement!
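To illustrate the kind of loop that benefits from being moved to C, here is a plain-Python sketch of a bisection search over a sorted array. This is an assumption about the general shape of such a routine, not the code from this PR; the name `find_interval` and the clamping behaviour at the ends are hypothetical.

```python
def find_interval(knots, x):
    """Return i such that knots[i] <= x < knots[i + 1], found by bisection.

    Hypothetical helper for illustration. `knots` must be sorted and have
    at least two entries; values outside the range are clamped to the
    first or last interval.
    """
    lo, hi = 0, len(knots) - 1
    # Halve the bracket [lo, hi] until it spans a single interval.
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if x < knots[mid]:
            hi = mid
        else:
            lo = mid
    return lo
```

In pure Python every comparison and index operation in this loop goes through the interpreter; with typed arguments and memoryviews the same loop can compile to plain C arithmetic, which is where a speedup of this sort would come from.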