S2Geometry geographical point index: preliminary results

I started working on some Python/NumPy wrappers for s2geometry indexes here: https://github.com/benbovy/pys2index

I did some preliminary tests and naive benchmarks (random points) against scikit-learn's BallTree index (with the Haversine metric), and the results are very promising! It might also be worth comparing it against some of the fast KD-tree implementations available in Python (with an appropriate coordinate system transformation).
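To make the "appropriate coordinate system transformation" concrete, here is a minimal NumPy-only sketch. It assumes a unit sphere, and the helper names (`latlon_to_xyz`, `chord_to_arc`, `haversine`) are made up for this illustration. Mapping lat/lon to 3-D Cartesian coordinates lets a Euclidean KD-tree return the same nearest neighbors as a Haversine-based index, because the chord length grows monotonically with the great-circle distance. Brute-force search stands in for the KD-tree here.

```python
import numpy as np


def latlon_to_xyz(lat_deg, lon_deg):
    """Map latitude/longitude in degrees to points on the unit sphere."""
    lat = np.radians(lat_deg)
    lon = np.radians(lon_deg)
    return np.column_stack(
        (np.cos(lat) * np.cos(lon), np.cos(lat) * np.sin(lon), np.sin(lat))
    )


def chord_to_arc(chord):
    """Convert a chord length on the unit sphere to a great-circle angle (radians)."""
    return 2.0 * np.arcsin(np.clip(chord / 2.0, 0.0, 1.0))


def haversine(lat1, lon1, lat2, lon2):
    """Great-circle angle (radians) between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (
        np.sin((lat2 - lat1) / 2.0) ** 2
        + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2
    )
    return 2.0 * np.arcsin(np.sqrt(np.clip(a, 0.0, 1.0)))


rng = np.random.default_rng(0)
lat = rng.uniform(-90, 90, 1000)
lon = rng.uniform(-180, 180, 1000)
qlat = rng.uniform(-90, 90, 50)
qlon = rng.uniform(-180, 180, 50)

# Nearest neighbor by Euclidean distance in 3-D (what a KD-tree would accelerate).
xyz = latlon_to_xyz(lat, lon)
qxyz = latlon_to_xyz(qlat, qlon)
chord = np.linalg.norm(qxyz[:, None, :] - xyz[None, :, :], axis=-1)
nearest_xyz = chord.argmin(axis=1)

# Nearest neighbor by Haversine distance, for comparison.
hav = haversine(qlat[:, None], qlon[:, None], lat[None, :], lon[None, :])
nearest_hav = hav.argmin(axis=1)
```

Both searches should pick the same points, and `chord_to_arc` recovers the angular distance from the Euclidean one, so results from a KD-tree could be reported in the same units as the Haversine metric.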

A nice thing with s2geometry is that we could leverage additional features like:

Setting a maximum lookup distance to speed up queries (related to #11 and #13).
Range queries, i.e., selecting all points within a latitude and/or longitude range (which would work well with xarray.Dataset.sel(lat=slice(...), lon=slice(...))), or within a region of any geometry (e.g., a polygon).
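As a reference for what such a range query would return, here is a brute-force NumPy sketch. The function `select_box` is hypothetical and is not part of s2geometry or pys2index. It scans every point, which is exactly the work a spatial index would avoid, and it assumes longitudes in [-180, 180].

```python
import numpy as np


def select_box(lat, lon, lat_range, lon_range):
    """Return positions of points inside a latitude/longitude box (degrees)."""
    lat_min, lat_max = lat_range
    lon_min, lon_max = lon_range
    in_lat = (lat >= lat_min) & (lat <= lat_max)
    if lon_min <= lon_max:
        in_lon = (lon >= lon_min) & (lon <= lon_max)
    else:
        # The box crosses the antimeridian, e.g. from 170 to -170.
        in_lon = (lon >= lon_min) | (lon <= lon_max)
    return np.flatnonzero(in_lat & in_lon)


lat = np.array([10.0, 45.0, 50.0, -20.0, 48.0])
lon = np.array([5.0, 2.0, 179.0, 30.0, -178.0])

inside = select_box(lat, lon, (40.0, 55.0), (0.0, 10.0))
wrapped = select_box(lat, lon, (40.0, 55.0), (170.0, -170.0))
```

The antimeridian case is one reason a sphere-aware index is attractive here: a plain slice on a longitude coordinate cannot express a box that wraps around ±180°.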

Build index benchmark: pys2index.S2PointIndex 3x faster than sklearn.neighbors.BallTree for 10M points.

Query index benchmark: pys2index.S2PointIndex 40x faster than sklearn.neighbors.BallTree for 100k points!
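For anyone who wants to reproduce the BallTree side of these numbers, a timing harness could look roughly like the sketch below. This is not the code from the notebook: the point counts are deliberately much smaller than the 10M / 100k used above, and the helper `random_latlon` is made up for this example. Note that scikit-learn's Haversine metric expects (lat, lon) pairs in radians.

```python
import time

import numpy as np
from sklearn.neighbors import BallTree

rng = np.random.default_rng(42)


def random_latlon(n):
    """Random (lat, lon) pairs in radians, the order the haversine metric expects."""
    lat = rng.uniform(-np.pi / 2, np.pi / 2, n)
    lon = rng.uniform(-np.pi, np.pi, n)
    return np.column_stack((lat, lon))


index_points = random_latlon(100_000)
query_points = random_latlon(1_000)

# Time the index construction.
t0 = time.perf_counter()
tree = BallTree(index_points, metric="haversine")
build_time = time.perf_counter() - t0

# Time a nearest-neighbor lookup (k=1) for all query points.
t0 = time.perf_counter()
distances, positions = tree.query(query_points, k=1)
query_time = time.perf_counter() - t0

print(f"build: {build_time:.3f} s, query: {query_time:.3f} s")
```

The returned distances are angles in radians on the unit sphere; multiply by the Earth's radius to get lengths.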

Notebook link: https://gist.github.com/benbovy/1f35c717b44791ca600655081b5b6fc3
