OceanParcels / Parcels

Main code for Parcels (Probably A Really Computationally Efficient Lagrangian Simulator)
https://www.oceanparcels.org
MIT License
294 stars 135 forks source link

Two libraries used for KDTree in Parcels. Use only one? #1616

Closed erikvansebille closed 2 months ago

erikvansebille commented 3 months ago

Throughout the codebase, Parcels uses two different libraries for kdtree:

https://github.com/OceanParcels/parcels/blob/035159a1b4002bc1550cebd275f93c0adaa8513d/parcels/particleset.py#L18-L21

https://github.com/OceanParcels/parcels/blob/035159a1b4002bc1550cebd275f93c0adaa8513d/parcels/interaction/neighborsearch/kdtreeflat.py#L2

Perhaps we should standardise this and use only one of them? Since scipy is already a dependency, perhaps better to use scipy.spatial.KDTree throughout?

VeckoTheGecko commented 3 months ago

Note that tutorial_nemo_curvilinear.ipynb should be updated to remove the comment "Note that this only works if you have the pykdtree package installed, which is only included in the Parcels dependencies in version >= 2.2.2"

VeckoTheGecko commented 3 months ago

Looking into this a bit more, pykdtree is still an active project. I think this is due to their benchmarking being better than the SciPy implemnetation, though I doubt these benchmarks are accurate since they're 12 years old (and I imagine scipy's performance may have significantly improved).

Let's go ahead with the removal, and if updated benchmarks show pykdtree is significantly better we can consider re-introducing (especially since the part of the code calling it looks to be part of the chunking algorithm).

Related: https://github.com/storpipfugl/pykdtree/issues/121