jmason42 opened this issue 5 years ago
I have had some success with `cProfile`. It takes some work to sift through the results, but it has a programmatic interface that makes this automatable once you have some patterns established. I'll give it a run today. Feel free to use your own approaches as well; profiling is more of an art than a science.
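For what it's worth, here is a minimal sketch of the programmatic route (the `run_simulation` entry point is a placeholder, not an existing function in the repo):

```python
import cProfile
import pstats

def run_simulation():
    # placeholder entry point; substitute the actual simulation call
    sum(i * i for i in range(10 ** 6))

profiler = cProfile.Profile()
profiler.enable()
run_simulation()
profiler.disable()

# sort by cumulative time and print the 20 most expensive calls
stats = pstats.Stats(profiler).sort_stats("cumulative")
stats.print_stats(20)

# stats.stats is a dict keyed by (file, line, function), so results can
# also be filtered or aggregated programmatically once patterns emerge
```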
We can make a `profiling` dir if there are code or other artifacts associated with profiling.
Just started a `cython` branch with some explorations there, FYI. Getting some performance improvements, though the conversion takes some work. Compiling with `cython -a arrow/arrow.pyx` generates an HTML file that describes how much Python you have to use (currently in `arrow/arrow.html`).
Excellent, it will be good to have a few implementations to compare. Frankly I may need to rewrite some things from the ground up to get Numba working; their support for NumPy is not complete.
Incidentally, an issue you will bump into with Cython (and part of the reason why I was so compelled by Numba) is random number generation; Cython can't compile calls to `numpy.random` functions. @1fish2 had a nice solution, which I think is in the old complexation code: generate large amounts of random numbers simultaneously, and re-generate those as needed (but not as often as every step).
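Something like the following is the shape of that trick (the class and method names here are made up for illustration, not taken from the complexation code): draw a big block of numbers up front and only refill when it runs out.

```python
import numpy as np

class RandomBuffer(object):
    """Pre-draws a large block of uniform random numbers so that inner-loop
    code (e.g. Cython-compiled code that can't call numpy.random directly)
    only ever indexes into a plain array. Illustrative sketch only."""

    def __init__(self, size=10 ** 6, seed=None):
        self._random_state = np.random.RandomState(seed)
        self._size = size
        self._refill()

    def _refill(self):
        # one big draw, done rarely rather than every step
        self._buffer = self._random_state.random_sample(self._size)
        self._index = 0

    def take(self, n):
        # re-generate the block only when it's exhausted
        if self._index + n > self._size:
            self._refill()
        out = self._buffer[self._index:self._index + n]
        self._index += n
        return out
```

The compiled inner loop then only has to index into a contiguous float64 array, which Cython handles without touching the Python API.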
It's hard to speed up code without proper profiling tools. Existing solutions that I'm aware of:

- The `profile`/`cProfile` modules. I find these to be cumbersome, particularly because they can track evaluation time for functions that aren't being called or defined in the relevant scope. Perhaps there is a better way to use these.
- `kernprof.py`, AKA the line profiler. Really terrific as far as isolating and inspecting one scope. Not great for routine profiling.
- `time` via the command line, or other tools like `pytest`. Only useful for isolated code, and subject to a lot of variability.
- The `timeit` module. Again, only useful for isolated code, but does a few things to temper out any evaluation-to-evaluation variability.

Custom solutions I've used in the past:

- `time.time()` calls to collect the evaluation time associated with blocks of code. Quick and easy, but potentially ugly and non-specific (a sketch follows this list).

I'm inclined to go with the latter; it will require us to break off more pieces and test them independently, but I think that's healthy anyway.
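As a rough sketch of what the `time.time()` approach could look like (assuming a hypothetical `timed` helper, not anything already in the repo), a small context manager keeps the instrumentation down to one `with` line per block:

```python
import time
from contextlib import contextmanager

TIMINGS = {}  # accumulated wall-clock seconds per labeled block

@contextmanager
def timed(label):
    # accumulate elapsed time under a named key
    start = time.time()
    try:
        yield
    finally:
        TIMINGS[label] = TIMINGS.get(label, 0.0) + (time.time() - start)

# usage (hypothetical call site):
# with timed("complexation"):
#     run_complexation_step()
```

Accumulating into a single dict also makes it easy to dump the collected timings at the end of a run, which helps with the "non-specific" complaint.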
As far as routine profiling goes, I think this is desirable. I find it to be a useful way to check the health of code, as well as a way to make sure that anticipated performance hits do indeed have the anticipated effect (sanity check). It's unclear to me where this ought to go. It can't really be a unit test, since we have no expected run time that is going to be consistent across hardware (and run time is variable regardless).