Open amandasaurus opened 1 year ago
I've wondered about that wording myself, but I think it's misleadingly restrictive based on the trait definition (https://docs.rs/rstar/0.11.0/rstar/trait.PointDistance.html). @adamreichold I know you've got some experience with distances other than Euclidean – can you help to clarify this?
@amandasaurus Here's what I was thinking of: https://github.com/georust/rstar/issues/48#issuecomment-679889333
You should be able to plug in https://docs.rs/geo/0.26.0/geo/algorithm/geodesic_distance/trait.GeodesicDistance.html squared and test it fairly easily??
I think the reference to "Euclidean" in the docs is misleading. Any distance metric fulfilling the usual axioms should do. (Out of the top of my head, I also don't see that we need anything beyond that, e.g. the metric does not need to be implied by a vector norm or a scalar product.)
(Note though that in https://github.com/georust/rstar/issues/48#issuecomment-679889333, I was indeed using the normal Euclidean distance in two-dimensional real space, I just changed the coordinates to polar coordinates and implemented the trait directly in terms of that. But the values were the same as if I would have converted my polar coordinates to Cartesian coordinates first and then used the provided implementation.)
Concerning adding points twice: This is one relatively simple way to handle the periodic "boundary conditions" of the angular coordinates, but there are others if this feels contrived to you. The paper https://arxiv.org/pdf/1712.02977.pdf discusses common approaches and proposes one especially efficient way of doing it.
On Tue, 26 Sep 2023 20:15 +02:00, Adam Reichold @.***> wrote:
Concerning adding points twice: This is one relatively simple way to handle the periodic "boundary conditions" of the angular coordinates, but there are others if this feels contrived to you. The paper https://arxiv.org/pdf/1712.02977.pdf discusses common approaches and proposes one especially efficient approach.
OK, it's always good to have fast code. However I'm mostly concerned that I don't understand enough to do it right 🤣. To ELI5, if I add point P with ($latitude1, $longitude1)
, then I must also add ($longitude1 + 180°, $latitude)
.
How do I do it with other geo types, like LineString
s or MultiLineString
s?
-- Amanda
if I add point P with
($latitude1, $longitude1)
, then I must also add($longitude1 + 180°, $latitude)
.How do I do it with other geo types, like
LineString
s orMultiLineString
s?
I would say that this is just a translation, i.e. you add a copy of the geometry with all its constituent points translated in the same manner.
The trick from the paper basically is re-parametrize you geometry to have fewer absolute and more relative and hence translation-invariant parameters so there is less work to do. But if all of your geometry is basically a bag of points, then just translating all of them should do the trick. (Of course, this then means negative angles and multiple overlap from a single piece of geometry and its mirrors, i.e. it can be inefficient, but it should still work.)
I'm running into a similar use case. My "hypothetical" approach is to use mercator projections to translate between planar and longitude/latitude points. Is this the recommended approach?
This paper is the clearest explanation of how to implement Lon / Lat indexing in an R{*}-tree that I've been able to find. I'm sure it's available on Sci-Hub in breach of copyright if you were to check the DOI (10.1007/978-3-642-40235-7_9)
If you're not using geo-types
you have the option of using either the "transform to 3D ECEF" approach, or if you are / would prefer it, the more elegant "compute MBR for geodetic coordinates" approach. Pseudocode for the distance function for option 2 is given in Algorithm 1
(p156) and an optimised version in Algorithm 2
(p159). I'm reproducing them below:
Data: c circumference of earth (spherical model)
Data: (φq,λq) query point
Data: (φl, λl, φh, λh) index MBR
if φl ≤ φq ≤φl then
if λq ≤ λl then return c·(λl − λq) / 360; // South of MBR
if λq ≥ λt then return c·(λq − λt)/ 360; // North of MBR
return 0; // Inside MBR
else if mod360(φl − φq) ≤ mod360(φq − φh) then // West of MBR
θh ← Azimuth(φl, λh, φq, λq);
if θh ≥ 270 then // North-West
return Great-Circle-Distance(φl, λh, φq, λq)
end
θl ← Azimuth(φl, λl, φq, λq);
if θl ≤ 270 then // South-West
return Great-Circle-Distance(φl, λl, φq, λq)
end
return |Cross-Track-Distance(φl, λl, φl, λh, φq, λq)|; // West
else // East of MBR
θh ← Azimuth(φh, λh, φq, λq);
if θh ≤ 90 then // North-East
return Great-Circle-Distance(φh, λh, φq, λq)
end
θl ← Azimuth(φh, λl, φq, λq);
if θl ≥ 90 then // South-East
return Great-Circle-Distance(φh, λl, φq, λq)
end
return |Cross-Track-Distance(φh, λh, φh, λl, φq, λq)|; // East
end
Data: c circumference of earth (spherical model)
Data: (φq, λq) query point
Data: (φl, λl, φh, λh) index MBR
if φl ≤ φq ≤ φl then
if λq ≤ λl then return c·(λl − λq) / 360; // South of MBR
if λq ≥ λt then return c·(λq − λt) / 360; // North of MBR
return 0; // Inside MBR
else if mod360(φl − φq) ≤ mod360(φq − φh) then // West of MBR
τ ← tan360(λq);
if mod360(φl − φq) ≥ 90 then // Large Δφ
if τ ≤ tan360((λl + λh) / 2) cos360(φl − φq) then
return Great-Circle-Distance(φl, λh, φq, λq); // North-West
else
return Great-Circle-Distance(φl, λl, φq, λq); // South-West
end
end
if τ ≥ tan360 (λh ) cos360 (φl − φq ) then // North-West
return Great-Circle-Distance(φl, λh, φq, λq); // South-West
if τ ≤ tan360 (λl) cos360 (φl − φq ) then
return Great-Circle-Distance(φl, λl, φq, λq);
return |Cross-Track-Distance(φl, λl, φl, λh, φq, λq)|; // West
else // East of MBR
τ ← tan360(λq);
if mod360(φq − φh) ≥ 90 then // Large Δφ
if τ ≤ tan360((λl + λh) / 2) cos360(φh − φq) then
return Great-Circle-Distance(φh, λh, φq, λq); // North-East
else
return Great-Circle-Distance(φh, λl, φq, λq); // South-East
end
end
if τ ≥ tan360 (λh) cos360 (φh − φq ) then
return Great-Circle-Distance(φh, λh, φq, λq); // North-East
if τ ≤ tan360 (λl ) cos360 (φh − φq ) then
return Great-Circle-Distance(φh, λl, φq, λq); // South-East
return |Cross-Track-Distance(φh, λh, φh, λl, φq, λq)|; // East
end
Some things worth noting:
geo
, you'll have to implement the Great-Circle distance and Cross-Track distance algorithms yourself. Feel free to use our implementations as a starting point – they tend to come with some baggage in terms of supporting functions – good luck!Cross-Track Distance
in the above algorithms accepts six parameters. I assume they relate to point, line_point_a, line_point_b
, whereas our implementation is implemented on Point
directly, so only requires the line_point_a
and line_point_b
parameters.Haversine for
Great-Circle-Distance`.To expand on this a little: you would use Algorithm{1, 2} as the basis of an rstar::Envelope
impl: the algorithms above return the minimum distance between a bounding box and a query point, so the trait methods would leverage the envelope
method of {Coord
, Line
, LineString
, Polygon
} to calculate distance_2
, contains_point
(distance_2
<= 0), and contains_envelope
. The query point's distance_2
impl would have to match, so it would also have to be Haversine
.
Hello @urschrei! I've been working on a solution for indexing on a spherical surface, inspired by the paper mentioned above. Would you be interested in extending the rstar
crate to support spherical envelopes, alongside the existing AABB
? If this aligns with the project's direction, I'd be happy to prepare a draft PR for further discussion.
I have a millions of points (in latitude & longitude). I want to calculate nearest neighbour queries using great circle distance, i.e. distance in metres along the surface of the earth. Can I use rstar for this?
rstar::PointDistance::distance_2
refers to “squared euclidean distance”, which implies not. I'm unsure if #70 answers this question... I'm OK with adding points twice, but I fear I don't fully understand how to do this right...