Closed luukvdmeer closed 3 years ago
A reprex:
library(sfnetworks)
# x is a set of points.
# y is the edges element of the sfnetwork, containing lines.
st_snap_to_network = function(x, y) {
# Function to find the nearest network point to a single input point.
f = function(z) {
# Convert sfg to sfc.
z = sf::st_sfc(z, crs = sf::st_crs(x))
# Run st_nearest_points.
# This returns for each combination (z, y) a linestring from z to the
# .. nearest point on y.
np = sf::st_nearest_points(z, y)
# Then, the endpoint of the np-line to the nearest feature y from z
# .. is the nearest network point to z.
lwgeom::st_endpoint(np[sf::st_nearest_feature(z, y),])
}
# Run f for all input points.
geoms = do.call("c", lapply(sf::st_geometry(x), f))
# Replace the geometries of x with their nearest network points.
if (inherits(x, "sf")) sf::st_geometry(x) = geoms
if (inherits(x, "sfc")) x = geoms
x
}
# Some edges.
l = dplyr::slice(roxel, 1:3)
# A point not on the edges.
p = sf::st_centroid(sf::st_union(l))
#> Warning in st_centroid.sfc(sf::st_union(l)): st_centroid does not give correct
#> centroids for longitude/latitude data
# The nearest point to p located on the edges.
np = st_snap_to_network(p, l)
#> although coordinates are longitude/latitude, st_nearest_points assumes that they are planar
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
# Plot.
plot(sf::st_geometry(l))
plot(sf::st_nearest_points(l, p), col = "red", add = T)
#> although coordinates are longitude/latitude, st_nearest_points assumes that they are planar
plot(p, pch = 20, add = T)
plot(np, col = "red", pch = 20, add = T)
# Does np intersect l? Yes! That's good.
sf::st_intersects(np, l)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
#> 1: 2
# Now lets use a slightly moved p.
p2 = sf::st_sfc(
p[[1]] + sf::st_point(c(0, 0.0004)),
crs = sf::st_crs(p)
)
# The nearest point to p2 located on the edges.
np2 = st_snap_to_network(p2, l)
#> although coordinates are longitude/latitude, st_nearest_points assumes that they are planar
#> although coordinates are longitude/latitude, st_nearest_feature assumes that they are planar
# Plot. Looks good again.
plot(sf::st_geometry(l))
plot(sf::st_nearest_points(l, p2), col = "red", add = T)
#> although coordinates are longitude/latitude, st_nearest_points assumes that they are planar
plot(p2, pch = 20, add = T)
plot(np2, col = "red", pch = 20, add = T)
# However.. Does np2 intersects l? No.
sf::st_intersects(np2, l)
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
#> 1: (empty)
# It is located 0.000008383502 meters from the linestring.
sf::st_distance(np2, l[2,])
#> Units: [m]
#> [,1]
#> [1,] 8.383502e-06
Created on 2020-06-08 by the reprex package (v0.3.0)
(just as a remark regarding the long/lat warning: projecting the data - I tested EPSG 31466 does not solve it either)
Regarding the nearest point on the nearest edge. How should we name such a point when creating a function that finds it?
igraph
uses the vertex to refer to what sfnetworks
and tidygraph
call nodes).Any ideas @Robinlovelace @agila5 @loreabad6 (and others)?
Any ideas @Robinlovelace @agila5 @loreabad6 (and others)?
Maybe nearest_point_on_edge ?
Outcomes from a small hack with @luukvdmeer today:
When we slightly change the proposed function to:
st_snap_to_network = function(x, y) {
# Function to find the nearest network point to a single input point.
f = function(z) {
# Convert sfg to sfc.
z = sf::st_sfc(z, crs = sf::st_crs(x))
# Run st_nearest_points.
# This returns for each combination (z, y) a linestring from z to the
# .. nearest point on y.
np = sf::st_nearest_points(z, y)
# Then, we get the np-line to the nearest feature y from z
np[sf::st_nearest_feature(z, y),] ### LINE CHANGED HERE!
}
# Run f for all input points.
geoms = do.call("c", lapply(sf::st_geometry(x), f))
# Replace the geometries of x with their nearest network points.
if (inherits(x, "sf")) sf::st_geometry(x) = geoms
if (inherits(x, "sfc")) x = geoms
x
}
We noticed:
LINESTRING
from sf::st_nearest_points()
, is passed onto sf::st_intersects()
, it seems to always intersect with the nearest edge. LINESTRING
itself can be passed onto lwgeom::st_split()
and so the edge will be split by the end-point of this line. lwgeom::st_split()
does is move slightly the newly created LINESTRING
s to include this POINT
. Since lwgeom
and sf
seem to handle the precision problem in this way, we think we can then work with these functions to address this issue. In that way, we avoid adding extra precision bias by doing things like calculating an st_buffer
around the snapped point to then split the line.
Won't fix, because of the precision error problem. For any point p and network G, it is practically impossible to return the projected (i.e. snapped) point of p on G with the guarantee that it intersects with G. Even an intersection of a point with a line does not always intersect with the line itself. Against all geometric rules, I know, but see below:
library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
v1 = st_point(c(7.532442, 51.95422))
v2 = st_point(c(7.53236, 51.95377))
v3 = st_point(c(7.53209, 51.95328))
l1 = st_sfc(st_linestring(c(v1, v2, v3)), crs = 4326)
v4 = st_point(c(7.53209, 51.95345))
v5 = st_point(c(7.53245, 51.95345))
l2 = st_sfc(st_linestring(c(v4, v5)), crs = 4326)
st_intersects(st_intersection(l2, l1), l1, sparse = F)
#> although coordinates are longitude/latitude, st_intersection assumes that they are planar
#> although coordinates are longitude/latitude, st_intersects assumes that they are planar
#> [,1]
#> [1,] FALSE
# Just to check:
# It is not the planar assumption that causes this.
# Transformed to a projected CRS same issue exists.
l1 = st_transform(l1, 3035)
l2 = st_transform(l2, 3035)
st_intersects(st_intersection(l2, l1), l1, sparse = F)
#> [,1]
#> [1,] FALSE
# A plot of the situation.
plot(l1)
plot(l2, add = T)
plot(st_intersection(l2, l1), pch = 20, col = "red", add = T)
Created on 2020-11-05 by the reprex package (v0.3.0)
That makes the snapping function pretty useless I think. The way to go now would be to first blend the points as nodes into the network with st_blend
(see #27) and then find the nearest node. That is a clean workflow that makes sense conceptually also I think.
The st_blend
function handles the precision problem internally, as follows:
Situation: A point p and a network G. Goal: Find the projection q of p on G, such that we can subdivide G at q (i.e. use q to split the edge in G on which it is located).
Find the edge x in G that is closest to p, using st_nearest_feature
Draw the shortest possible connection c between p and x, using st_nearest_points
Now, there are three options:
The endpoint b of c intersects with x. Hence, q is equal to b.
Because of precision errors, the endpoint b of c is located a few nano-meters away from x, but at the opposite side of x than p. That is, b itself does not intersect with x, but c does. Hence, conceptually q is equal to the intersection of c and x. Because of precision errors, however, this intersection might not intersect with x. That is, it is impossible for us to return the true q. However, since c does intersect with x, we can now use c to subdivide G.
Because of precision errors, the endpoint b of c is located a few nano-meters away from x, but at the same side of x as p. That is, neither b nor c intersect with x. This is the problematic case. What st_blend
does is:
This should guarantee intersection between the extended c and x. Hence, conceptually, q is equal to that intersection. Because of precision errors, however, this intersection might not intersect with x. That is, it is impossible for us to return the true q. However, since our extended c does intersect with x, we can use now c to subdivide G.
The three options are plotted (very exaggerated) below:
library(sf)
#> Linking to GEOS 3.7.1, GDAL 2.2.2, PROJ 4.9.2
par(mar = c(1,1,1,1))
x = st_linestring(c(st_point(c(0,-0.5)), st_point(c(0,2.5))))
p1 = st_point(c(-1,2))
p2 = st_point(c(-1,1))
p3 = st_point(c(-1,0))
b1 = st_point(c(0,2))
b2 = st_point(c(0.2,1))
b3 = st_point(c(-0.2,0))
c1 = st_linestring(c(p1, b1))
c2 = st_linestring(c(p2, b2))
c3 = st_linestring(c(p3, b3))
c3_ext = sfnetworks:::extend_line(c3, 0.4)
plot(x)
plot(c1, col = "red", add = T)
plot(c2, col = "red", add = T)
plot(c3, col = "red", add = T)
plot(c3_ext, col = "red", lty = 2, add = T)
plot(p1, pch = 20, add = T)
plot(p2, pch = 20, add = T)
plot(p3, pch = 20, add = T)
plot(b1, pch = 20, col = "red", add = T)
plot(b2, pch = 20, col = "red", add = T)
plot(b3, pch = 20, col = "red", add = T)
plot(st_intersection(c1, x), pch = 4, add = T)
plot(st_intersection(c2, x), pch = 4, add = T)
plot(st_intersection(c3_ext, x), pch = 4, add = T)
Created on 2020-11-05 by the reprex package (v0.3.0)
Note that this does require to have the processes of projection (i.e. snapping of the point to the network) and subdivision (i.e. splitting the edge) inside the same function. That is what st_blend
is. We can not split it into two separate functions for the individual processes, because as said, it is sometimes impossible to return the true location of the projection q.
Unfortunately this whole precision error does influence performance. Because even when you set your tolerance to 0, i.e. you only want to blend points that are already located on edges, we have to internally change this 0 to a very small number (currently 1e-5) because points that are on an edge may as well not be evaluated as intersecting. Hence, we have to search for all points located within the (small) tolerance before proceeding. This is time consuming.
Note also that this is not an R specific error. I read in a recent blogpost that in Pyhton it is the same. See https://towardsdatascience.com/connecting-pois-to-a-road-network-358a81447944
What do you think @Robinlovelace @agila5 @loreabad6 Sounds like a reasonable hack?
Won't fix as explained above.
Please describe the problem. It would be nice to snap any given point to its nearest point on the network, no matter if that is an existing node or not. Then, that nearest network point can be added as a new node to the network, by splitting the edge it is located on (using for example
lwgeom::st_split
).In
sf
, it is already possible to find the nearest point on a feature to another given feature. A quick function I wrote to do this:However, there is a problem with this approach. The returned point does not always intersect with the network.. Sometimes it is located a few millimetres away from the edge. This has to do with floating point precision.
See: https://gis.stackexchange.com/questions/288988/nearest-point-does-not-intersect-line-in-sf-r-package And: https://github.com/r-spatial/sf/issues/790
This reply does not sound hopeful:
Describe how you currently solve the problem, or how attempted to solve it There is a recent blogpost addressing this problem, which uses a solution of drawing a small buffer around the returned nearest network point: https://ryanpeek.org/mapping-in-R-workshop/03_vig_snapping_points_to_line.html
Another option would be to include the returned nearest network point in the edge, by moving the edge slightly.
Describe the desired functionality of sfnetworks A function
snap_to_network
that replaces the geometry of the input point by the geometry of its nearest network point, that is guaranteed to intersect with the network. This can then also be used in shortest paths calculations between any set of given points.