Closed kevindlewis23 closed 1 month ago
I'm getting similar results after installing hexrd and running fit-grains from the command line.
With some more testing, it seems the C++ implementation is faster when only a single point is passed in, but the Python implementation does much better as more points are passed. The two run in about the same time at 25 inputs, and for anything larger, the Python implementation is faster.
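A crossover like this can be measured with a small benchmark sweep over input sizes. Here is a self-contained sketch; `impl_cpp` and `impl_py` are hypothetical stand-ins for the two hexrd implementations, not the real functions:

```python
import timeit

import numpy as np


def crossover_benchmark(impl_a, impl_b, sizes, repeats=5):
    """Time two implementations over increasing input sizes.

    Returns {size: (best_time_a, best_time_b)}, taking the minimum of
    `repeats` timed runs to reduce scheduling noise.
    """
    results = {}
    rng = np.random.default_rng(0)
    for n in sizes:
        xy = rng.random((n, 2))  # n points in 2D, like detector coordinates
        t_a = min(timeit.repeat(lambda: impl_a(xy), number=10, repeat=repeats))
        t_b = min(timeit.repeat(lambda: impl_b(xy), number=10, repeat=repeats))
        results[n] = (t_a, t_b)
    return results


# Hypothetical stand-ins for the C++ and Python inverse-distortion functions.
def impl_cpp(xy):
    return xy * 2.0  # placeholder computation


def impl_py(xy):
    return xy + xy  # placeholder computation


if __name__ == "__main__":
    for n, (ta, tb) in crossover_benchmark(impl_cpp, impl_py, [1, 25, 1000]).items():
        print(f"n={n}: a={ta:.2e}s  b={tb:.2e}s")
```

Plotting the two timing columns against `n` makes the crossover point (around 25 inputs in my runs) easy to see.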
Actually, after some checking, both implementations use only one CPU core and no GPU, so it's not parallelization that's causing this. My guess is that NumPy's implementations are simply better than xtensor's: xtensor wins for small inputs because C++ has less per-call overhead than Python, but once the input is large enough, most of the time goes to numpy/xtensor doing vector operations, and NumPy is just faster at those.
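The overhead-vs-throughput tradeoff is easy to reproduce in miniature: for tiny inputs a plain Python loop can beat a vectorized call because it skips the fixed dispatch cost, while for large arrays the vectorized kernel wins by a wide margin. A self-contained illustration (generic NumPy, not hexrd code):

```python
import timeit

import numpy as np


def scale_loop(values, factor):
    # Plain Python: no array-dispatch overhead, but interpreter work per element.
    return [v * factor for v in values]


def scale_numpy(arr, factor):
    # One vectorized call: fixed dispatch cost, then a tight compiled loop.
    return arr * factor


if __name__ == "__main__":
    small = [1.0, 2.0, 3.0]
    small_arr = np.array(small)
    large = list(range(100_000))
    large_arr = np.arange(100_000, dtype=np.float64)

    # Small input: fixed overhead dominates, so the loop can win.
    t_loop_small = timeit.timeit(lambda: scale_loop(small, 2.0), number=10_000)
    t_np_small = timeit.timeit(lambda: scale_numpy(small_arr, 2.0), number=10_000)

    # Large input: per-element throughput dominates, so vectorization wins.
    t_loop_large = timeit.timeit(lambda: scale_loop(large, 2.0), number=100)
    t_np_large = timeit.timeit(lambda: scale_numpy(large_arr, 2.0), number=100)

    print(f"3 elems:    loop {t_loop_small:.4f}s  numpy {t_np_small:.4f}s")
    print(f"1e5 elems:  loop {t_loop_large:.4f}s  numpy {t_np_large:.4f}s")
```

The same logic applies to a C++ extension: crossing the Python/C++ boundary and materializing xtensor arrays is a fixed cost that only pays off if the per-element work inside is at least as fast as NumPy's.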
There's a `ge_41rt_inverse_distortion` function written in C++ in `hexrd/transforms/cpp_sublibrary/src/inverse_distortion`, and this is used in `hexrd/distortion/ge_41rt.py` instead of the method `_ge_41rt_inverse_distortion`.
Presumably this is for speed reasons, but testing shows that it actually adds 3 seconds to the runtime of fit-grains (running test_fit-grains.py goes from ~6 seconds to ~9 seconds). I assumed this was just compilation time, but when I change the test to run fit-grains 3 times instead of once, the runtime triples, so either it's not compilation time or it is recompiling for some reason. For the actual hexrd package, does it actually become faster, or is there testing of that?
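One way to distinguish a one-time cost (import, JIT, compilation) from genuine per-call slowdown is to time each run separately: a warm-up cost shows up only in the first call, whereas per-call work makes total runtime scale linearly with the number of runs, which is what the tripling above suggests. A sketch with a hypothetical `fit_grains_run` stand-in for a single fit-grains invocation:

```python
import time


def time_calls(fn, n_calls=3):
    """Time each call separately; a one-time cost shows up only in call 0."""
    durations = []
    for _ in range(n_calls):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


def looks_per_call(durations, tolerance=0.5):
    """True if later calls cost about as much as the first (no warm-up effect)."""
    first, rest = durations[0], durations[1:]
    return all(abs(d - first) <= tolerance * max(first, 1e-9) for d in rest)


# Hypothetical stand-in for a single fit-grains run.
def fit_grains_run():
    sum(i * i for i in range(100_000))


if __name__ == "__main__":
    durations = time_calls(fit_grains_run)
    print("per-call cost, not one-time compilation:", looks_per_call(durations))
```

If the first call were much slower than the rest, recompilation or caching would be the likely culprit; roughly equal call times point at the per-call path itself.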