holoviz / spatialpandas

Pandas extension arrays for spatial/geometric operations
BSD 2-Clause "Simplified" License
308 stars 24 forks source link

Inlining hilbertcurve dependency or getting it on conda #51

Open philippjfr opened 4 years ago

philippjfr commented 4 years ago

The spatial indexing in spatialpandas is using the hilbertcurve package. The implementation is a tiny package of about ~150 lines of code, which is currently MIT licensed. Since this is a well established implementation of a pretty standard algorithm I don't think we will gain a whole lot from depending directly on this package since there are unlikely to be bug fixes in the actual code. At the same time as long as we get the dependency on conda once we also don't need to worry much about having to wait for a release. On balance my vote is to simply inline the code and potentially optimize it with numba later if that's worthwhile. In the short term this will simplify my short term pain in setting up the build infrastructure. Overall I also think the conda ecosystem suffers from tiny packages so I'm not entirely enthusiastic about adding it conda-forge and potentially defaults (eventually).

Cc: @jonmmease @jbednar

jbednar commented 4 years ago

Inlining sounds like a good approach here.

jonmmease commented 4 years ago

For context, when I started this I used the hilbertcurve package (https://github.com/galtay/hilbertcurve) as a reference, but not a runtime dependency. I developed a numba implementation of the hilbert curve calculations in https://github.com/holoviz/spatialpandas/blob/master/spatialpandas/spatialindex/hilbert_curve.py, and this is what is used by the RTree implementation in https://github.com/holoviz/spatialpandas/blob/master/spatialpandas/spatialindex/rtree.py.

I did use the hilbertcurve package as a testing dependency to make sure my implementation produced identical results (See https://github.com/holoviz/spatialpandas/blob/master/tests/spatialindex/test_hilbert_curve.py).

I don't have a strong opinion on inlining vs depending, but wanted to point out that it's only used in testing.

jbednar commented 4 years ago

Sounds like a good reason to vendor it in; we don't want any changes to that library to break our tests; it's effectively just some reference code.

philippjfr commented 4 years ago

Thanks @jonmmease, totally missed that.

philippjfr commented 4 years ago

For now I'm happy to install it separately on CI.