geographiclib / geographiclib-python

Python implementation of the geodesic routines
MIT License
36 stars 5 forks source link

Vectorization? #2

Closed samsonknight closed 2 years ago

samsonknight commented 2 years ago

Native support for vectorized array data would be appreciated. You can use np.vectorize but full native support would be ideal.

cffk commented 2 years ago

This has been on my to do list for a while. One hold-up is that I would be starting from scratch learning numpy; another is that the inverse geodesic routine has lots of branches. I would follow how I vectorized the Octave/MATLAB routines. These allow you to mix scalar and matrix arguments and the results are identical to a succession of scalar calls. To get the last property, it's necessary to prune the arrays of numbers during the main iteration loop depending on which ones have reached convergence. This is all routine stuff in Octave and I imagine similar techniques can be used with numpy. If you can point me to any resource where such numpy vectorization techniques are described that would be useful.

Even if I do tackle this it probably won't be for a year. I've got other stuff on my stack of tasks. So if someone else wants to take a crack at this, I'd be willing to help.

Finally, don't forget that pyproj offers you python access to the same geodesic routines compiled in C.

samsonknight commented 2 years ago

Somehow I totally missed that in pyproj. Looks like it has native support for input numpy arrays. That will meet my needs. Testing shows a major speed up compared to even using np.vectorize. Thanks for the tip!

cffk commented 2 years ago

OK, thanks for letting me know.

cffk commented 2 years ago

Closing for now

gwerbin commented 1 year ago

Some thoughts on this from a current (happy) user of Geographiclib:

Instead of reimplementing this entirely from scratch in Numpy (which might be tricky using its current "layered" architecture of Geodesic and GeodesicLine), it might make more sense to use Numba to vectorize this, and/or compile it with an ahead-of-time compiler like Mypyc, Nuitka, or Pythran.

The current implementation uses pure Python floats, so it should be relatively easy for one of those compilers to generate efficient code. Using pure Python floats also means that this package is readily compatible with alternative Python implementations like PyPy.

Numba is especially interesting here because it can automatically turn a "scalar" Python function into a proper Numpy "ufunc" using its @numba.vectorize decorator. Numba also has (currently experimental) support for compiling entire classes. However, some refactoring might be required in order to make this code amenable to vectorization.

Some other libraries (like Pyproj) implement an ad-hoc subset of Numpy's vectorization inside their C/C++ wrapper layer, but there doesn't seem to be much point of doing this in a library that doesn't use a C extension.