Closed DXist closed 2 years ago
Hello! Were you aware of the optional rstar integration already built into geo-types? Or is this trying to achieve something else?
https://github.com/georust/geo/blob/master/geo-types/src/point.rs#L552
Hi @michaelkirk , I want to get nearest neighbours and their geo distances in meters for relatively close geo points. I am going to use these distances to obtain lower and upper bounds in Incremental Euclidean Restriction algorithm and then find nearest neighbours on road network.
So you are dealing with lat/lon points, but we want to find neighbors based on meters.
I'm not familiar enough to know if this is reasonable. I've only ever used an R-Tree with boring cartesian math. So I'll have to wait for someone more knowledgable to weigh in.
edit: specifically, the approach I'd personally take is, to have my application project my points to a cartesian space (maybe using proj), and then trust the rtree to work with boring normal math.
specifically, the approach I'd personally take is, to have my application project my points to a cartesian space (maybe using proj), and then trust the rtree to work with boring normal math.
I am not sure about the performance implications, but I think this might be preferable. Using PointWithData
, the original coordinates could be stored alongside the projected ones so that there are no roundtrip inaccuracies for the results of the neighbourhood query.
So you are dealing with lat/lon points, but we want to find neighbors based on meters.
I'm not familiar enough to know if this is reasonable. I've only ever used an R-Tree with boring cartesian math. So I'll have to wait for someone more knowledgable to weigh in.
edit: specifically, the approach I'd personally take is, to have my application project my points to a cartesian space (maybe using proj), and then trust the rtree to work with boring normal math.
Projections like Universal Mercator Projection work good for local data on city or small region level. This approach becomes inconvenient when geo coordinates are expected to be global. For UTM there are 60 zones in 2 hemispheres that require 120 local Rtrees to store global data. Zone partitioning introduces discontinuity and complicates processing geo data around the boundaries.
specifically, the approach I'd personally take is, to have my application project my points to a cartesian space (maybe using proj), and then trust the rtree to work with boring normal math.
I am not sure about the performance implications, but I think this might be preferable. Using
PointWithData
, the original coordinates could be stored alongside the projected ones so that there are no roundtrip inaccuracies for the results of the neighbourhood query.
I've benchmarked custom distance method using 10000 random points on city level and 100 nearest neighbour queries. Custom distance takes +18% to construct the tree. NN queries are about ~22 times slower.
It's possible to precompute latitude distance scaling coefficients with a ~0.1 degree step. This will speedup distance calculations at the expense of slightly less accurate results.
With precomputed coefficients NN queries with custom distance method are about ~18 times slower than original implementation.
I've decided to convert coordinates to 3D ECEF coordinate system and use euclidean distances. My benchmark shows that it's ~3 times faster to construct and ~21 times faster to query nearest neighbours than using custom distance metric with 2D lon/lat coordinates.
Thank you for your comments! I'm closing this issue.
[ ] I added an entry to
rstar/CHANGELOG.md
if knowledge of this change could be valuable to users.This PR is a demo of an attempt to use rstar to store geo points (with longitude and latitude as x, y values) and run nearest neighbour algorithm between close points.
To implement the demo in external crate I have exposed PointExp trait to public interface and disabled default PointExp implementation for Point types except arrays and tuples.
I added the demo to
rstar/src/geopoint.rs
.Custom
sub
method calculates approximate distance between two points on WGS84 ellipsoid model of the Earth(https://en.wikipedia.org/wiki/Earth_radius#Meridional), using the code from https://github.com/vipera/cheap-ruler-rs . This allows to calculate distances with acceptable precision up to ~500km which is good enough for my use case. The method returns deltas along longitudes and latitudes scaled to distances in meters.As I understand
PointExt
should be private. Is it better to introduce a public trait for coordinate metric distances and implementPointExt::sub
method via something likeCoordMetricDistance::delta_between_points
?