hypertidy / geodist

Ultra lightweight, ultra fast calculation of geo distances
https://hypertidy.github.io/geodist/

Sequential geodist produces different results to paired #19

mem48 closed 5 years ago

mem48 commented 5 years ago

geodist seems to produce different results in sequential mode than when the same consecutive points are paired manually.

# Make example lon/lat points
mat <- matrix(c(-0.193011, 52.15549,
                -0.197722, 52.15395,
                -0.199949, 52.15527,
                -0.199533, 52.15762,
                -0.193205, 52.15757,
                -0.174015, 52.17529),
              ncol = 2, byrow = TRUE)
colnames(mat) <- c("lon","lat")

# Use sequential mode
dist1 <- geodist::geodist(mat, sequential = TRUE)

# Manually make to and from objects
from <- mat[1:5,]
to <- mat[2:6,]
dist2 <- geodist::geodist(from, to, paired = TRUE)
identical(dist1, dist2) # FALSE
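
Since `identical()` demands bit-for-bit equality, it cannot tell a rounding-level discrepancy from a real one. A minimal sketch for sizing the gap, assuming both inputs are plain numeric vectors of equal length such as `dist1` and `dist2` above; `compare_dists` is a hypothetical helper, not part of geodist:

```r
# Hypothetical helper (not part of geodist): summarise how two numeric
# distance vectors differ, rather than only whether they differ.
compare_dists <- function(a, b) {
    stopifnot(is.numeric(a), is.numeric(b), length(a) == length(b))
    a <- as.vector(a)
    b <- as.vector(b)
    list(
        identical = identical(a, b),          # bit-for-bit equality
        all_equal = isTRUE(all.equal(a, b)),  # equality within all.equal's default tolerance
        max_abs_diff = max(abs(a - b))        # largest absolute discrepancy
    )
}

# Synthetic values only, to show the three possible outcomes
compare_dists(c(100, 200), c(100, 200))          # identical
compare_dists(c(100, 200), c(100, 200 + 1e-9))   # not identical, but all_equal
compare_dists(c(100, 200), c(100, 205))          # genuinely different
```

Calling `compare_dists(dist1, dist2)` on the vectors above would then show whether the mismatch is floating-point noise or a difference of whole metres.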

mpadge commented 5 years ago

Oh yeah, you are indeed right there. Seems like a pretty mission-critical bug I've buried in there ...

mem48 commented 5 years ago

It has been causing me a headache all afternoon.

mpadge commented 5 years ago

Even more interesting:

dist3 <- geodist::geodist(mat)
dist3 <- dist3 [which (row (dist3) == (col (dist3) + 1))] # first sub-diagonal: distances between consecutive points
identical (dist3, dist1) # FALSE
identical (dist3, dist2) # TRUE

it's actually a bug in the sequential method!
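
The indexing above works because `row()` and `col()` return matrices of row and column indices, so `row (m) == col (m) + 1` selects the entries `m[i + 1, i]` directly below the main diagonal. In a full distance matrix those are the distances between consecutive points. A small base-R sketch on a made-up matrix (the positions are illustrative, not real coordinates):

```r
# Made-up 1-D positions; entry [i, j] of m is |pts[i] - pts[j]|
pts <- c(0, 10, 30, 60)
m <- abs(outer(pts, pts, "-"))

# Entries directly below the main diagonal: m[2, 1], m[3, 2], m[4, 3]
sub_diag <- m[which(row(m) == col(m) + 1)]
sub_diag

# These match the gaps between consecutive positions
diff(pts)
```

The result always has `nrow(m) - 1` elements, which is why it lines up with the output of the sequential and paired calls on the same points.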

mpadge commented 5 years ago

library(geodist)
mat <- matrix(c(-0.193011, 52.15549,
                -0.197722, 52.15395,
                -0.199949, 52.15527,
                -0.199533, 52.15762,
                -0.193205, 52.15757,
                -0.174015, 52.17529),
              ncol = 2, byrow = TRUE)
colnames(mat) <- c("lon","lat")
from <- mat[1:5,]
to <- mat[2:6,]

measures <- c ("haversine", "vincenty", "cheap", "geodesic")
res <- sapply (measures, function (m) {
    dist1 <- geodist::geodist(mat, sequential = TRUE, measure = m)
    dist2 <- geodist::geodist(from, to, paired = TRUE, measure = m)
    dist3 <- geodist::geodist(mat, measure = m)
    dist3 <- dist3 [which (row (dist3) == (col (dist3) + 1))] # first sub-diagonal: distances between consecutive points
    c (identical(dist1, dist2), identical(dist1, dist3), identical(dist2, dist3))
              })
rownames (res) <- c ("seq-paired", "seq-default", "paired-default")
knitr::kable (res)
|               |haversine |vincenty |cheap |geodesic |
|:--------------|:---------|:--------|:-----|:--------|
|seq-paired     |TRUE      |TRUE     |FALSE |TRUE     |
|seq-default    |TRUE      |TRUE     |FALSE |TRUE     |
|paired-default |TRUE      |TRUE     |TRUE  |TRUE     |

Created on 2019-04-11 by the reprex package (v0.2.1)

It's just a bug with sequential calculation using cheap distances ... now to find what must be an easy fix ...