Closed edwardlavender closed 1 year ago
Hi,
Thank you for your nice feedback about cppRouting
.
Coordinates are used in two one-to-one
algorithms, available in get_*_pair()
functions : A and NBA (New Bidirectionnal A).
Coordinates are used to compute an heuristic fonction (euclidean distance) to decrease search space of Dijkstra algorithm.
So if cost
are correct geodesic distances between nodes, the shorest path between 2 nodes must be correct since algorithms minimize the sum of distances. To avoid errors, I would not use A or NBA since heuristic is expressed in Euclidean distance.
Let me kown if I don't understand correctly your question !
Thanks for your prompt response! I think you are saying that lon/lat coordinates can be used as long as the cost
is defined correctly as the geodesic distances between nodes? However, I have observed an issue with distances not correctly wrapping around the international date line (i.e., the distance between (-179, 0) and (179, 0) is much longer than expected). Perhaps I am misunderstanding something. Here is a reproducible example:
#### cppRouting version
packageVersion("cppRouting")
# > 3.1
#### Define example dataset
# * We want to calculate shortest distances from a point
# ... on a lon/lat RasterLayer to the cells of that RasterLayer.
# * Normally, I would do this if there are barriers in the landscape e.g., coastline,
# ... but for simplicity we will just ignore that here.
# * We will choose a point near the International Date Line
# ... to check distance calculations.
surface <- raster::raster(res = 5, crs = "+proj=longlat +ellps=WGS84 +datum=WGS84 +no_defs")
surface[] <- runif(raster::ncell(surface), 0, 1)
centroid <- cbind(-179, 0)
#### Calculate Euclidean distances from point (for comparison)
reuc <- raster::distanceFromPoints(surface, centroid)
reuc <- raster::mask(reuc, surface)
#### Calculate shortest distances from point via cppRouting
# Define point (node)
centroid_cell <- raster::cellFromXY(surface, centroid)
centroid_cpp <- sprintf("%.f", centroid_cell)
# Define cells on raster
cells <- raster::as.data.frame(surface)
cells <- data.table::data.table(cell = seq_len(nrow(cells)),
euc = reuc[],
value = cells[, 1])
cells[, cell_cpp := sprintf("%.f", cell)]
# Define distances between adjacent cells on raster
edges <- SpaDES.tools::adj(surface, cells = cells$cell, directions = 8)
edges <- data.table::as.data.table(edges)
edges[, cost := raster::pointDistance(raster::xyFromCell(surface, edges$from),
raster::xyFromCell(surface, edges$to),
lonlat = TRUE)]
edges[, from := sprintf("%.f", from)]
edges[, to := sprintf("%.f", to)]
# Make graph
cells[, x := raster::xFromCell(surface, cells$cell)]
cells[, y := raster::yFromCell(surface, cells$cell)]
coord <- data.frame(node = cells$cell_cpp, X = cells$x, Y = cells$y)
graph <- cppRouting::makegraph(df = edges, directed = TRUE, coords = coord)
# Compute shortest distances from centroid to all non NA cells
cells <- cells[!is.na(value), ]
clcp <- cppRouting::get_distance_matrix(
Graph = graph,
from = centroid_cpp,
to = cells$cell_cpp)
cells[, clcp := as.numeric(clcp)[match(cells$cell_cpp, dimnames(clcp)[[2]])]]
#### Calculate shortest distances via gDistance (for comparison)
# Define a transition matrix where the ease of movement between cells is equal
tr <- gdistance::transition(surface, function(...) 1, directions = 8, symm = TRUE)
# Use gdistance::geoCorrection to account for lon/lat coords
tr <- gdistance::geoCorrection(tr, type = "c")
cells$glcp <- gdistance::costDistance(x = tr, fromCoords = centroid,
toCoords = as.matrix(cells[, c("x", "y")]))[1, ]
# Compare cppRouting and gDistance outputs
all.equal(cells$clcp, cells$glcp)
#### Convert euclidean/shortest distances to rasters for plotting
reuc <- surface
reuc[cells$cell] <- cells$euc
rclcp <- surface
rclcp[cells$cell] <- cells$clcp
rglcp <- surface
rglcp[cells$cell] <- cells$glcp
#### Plot results
# Set plotting window
pp <- par(mfrow = c(2, 3), oma = c(2, 2, 2, 2), mar = c(2, 2, 2, 2))
# Plot lcp ~ euc (expect approx. straight line)
plot(cells$euc, cells$clcp); abline(0, 1, col = "red")
plot(cells$euc, cells$glcp); abline(0, 1, col = "red")
plot(0, type = "n", axes = FALSE, xlab = "", ylab = "")
# Plot Euclidean distances map
# ... This correctly shows the closeness between -179 and +179 (etc.)
raster::plot(reuc, main = "Euclidean")
points(centroid)
# Plot cppRouting shortest distances maps
# ... This does not seem to account for the international date line
# ... i.e., -179 and + 179 are far apart
raster::plot(rclcp, main = "cppRouting")
points(centroid)
# Plot gDistance shortest distances maps
raster::plot(rglcp, main = "gDistance")
points(centroid)
par(pp)
By the way, in the code above, I am using sprintf
before cppRouting
functions to define nodes because I have sometimes experienced errors with 'nodes not in graph' when numbers are passed to functions. I think this comes from the use of as.character()
in cppRouting
functions, such as makegraph()
(e.g., for a node that is 2e5, this gets converted to "2e+05" rather than "200000").
Well, it seems that there are some missing edges connecting -180 and 180 longitudes in cppRouting
network.
For example, node 1225 (-179, 0) should be connected to 1296 (179,0) since these cells are contiguous like in a cylinder, right ?
If I set centroid <- cbind(0, 0)
, results are equals.
For your information, get_distance_matrix
don't use coords
attribute.
Thank you for your feedback! It looks like there is an issue with how I prepared the adjacency matrix via adj()
in the above example then. To summarise, when it comes to cppRouting
:
get_*_pair()
functions, you can include lon/lat coordinates and the shortest distances will be calculated, if the cost
is defined as the geodesic distance correctly (and there is not a 'missing' edge at -180/+180). get_distance_matrix()
, coordinates are not used, so you can't use lon/lat data in those functions. Is that right?
Thanks again for your advice & for making this great package available.
If cost
are correct distance, every function will work.
Coordinates are only used in get_*_pair()
function with A*
and NBA
algorithms.
These algorithms will give you the same result but it is faster.
This package is brilliant & I use it all the time (mainly for calculating the shortest distances between locations). This is a feature request - it would be really useful if the
get_distance_*()
functions in the package could handle longitude and latitude coordinates. I appreciate that you can project lon/lat coordinates, but this can induce distortions which are undesirable & cause issues around the International Date Line. Ideally, it would be possible to supply a data frame with from/to/cost tomakegraph()
, wherecost
represents geodesic distances, with lon/lat coords supplied to thecoords
argument of the same function, and then the package (e.g.,get_distance_matrix()
) would correctly compute shortest distances between selected nodes. Is this possible?