r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.3k stars 289 forks source link

point created by intersecting two linestrings does not intersect them #2410

Open DOSull opened 4 days ago

DOSull commented 4 days ago

Describe the bug A point I generate by intersecting two linestrings fails expected st_intersects and does not yield expected st_relate pattern.

To Reproduce

sd <- 0.01
lines <- st_multilinestring(list(matrix(c(-1, 1, -1, 1) + 
                                           rnorm(4, 0, sd), 2, 2), 
                                  matrix(c(-1, 1, 1, -1) + 
                                           rnorm(4, 0, sd), 2, 2))) |> 
  st_sfc() |> st_cast("LINESTRING")
pt <- st_intersection(lines[1], lines[2]) |> st_sfc()

st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
 1: (empty)

st_relate(pt, lines)
     [,1]        [,2]       
[1,] "FF0FFF102" "FF0FFF102"

So the point resulting from intersecting two lines does not intersect them, and is not related to them as expected ("0FFFFF102"). Inspection of the objects shows that the point is at the intersection of the two lines close to (0, 0).

> lines
Geometry set for 2 features 
Geometry type: LINESTRING
Dimension:     XY
Bounding box:  xmin: -0.9997257 ymin: -1.020536 xmax: 1.012496 ymax: 0.9998001
CRS:           NA
LINESTRING (-0.9929678 -0.9947806, 1.012496 0.9...
LINESTRING (-0.9997257 0.9998001, 0.9876878 -1....
> pt
Geometry set for 1 feature 
Geometry type: POINT
Dimension:     XY
Bounding box:  xmin: 0.00154847 ymin: -0.01806063 xmax: 0.00154847 ymax: -0.01806063
CRS:           NA
POINT (0.00154847 -0.01806063)

Additional context If I make lines running from (-1 -1) to (1 1) and (-1 1) to (1 -1) exactly (by setting sd <- 0) then they intersect at (0 0) and that point satisfies expected spatial predicates against the generating intersecting lines, i.e. the st_relate pattern is "0FFFFF102".

I have experimented with setting precision on the geometries but have only obtained expected results reliably when precision is set very crudely. If the problem is deemed 'upstream' in GEOS, as I suspect it might be (geos::geos_relate produces the same results on these geometries as sf) what might be a viable workaround in sf?

I initially encountered this problem working with lwgeom::st_split() trying to split lines by their intersection points, and getting many fewer output linestrings than expected. On looking more closely this was what I ran into.

> sessioninfo::session_info() ─ Session info ────────────────────────────────────────────────────────── setting value version R version 4.3.3 (2024-02-29) os macOS Sonoma 14.5 system aarch64, darwin20 ui RStudio language (EN) collate en_US.UTF-8 ctype en_US.UTF-8 tz Pacific/Auckland date 2024-07-01 rstudio 2024.04.1+748 Chocolate Cosmos (desktop) pandoc NA ─ Packages ─────────────────────────────────────────────────────────── package * version date (UTC) lib source class 7.3-22 2023-05-03 [1] CRAN (R 4.3.3) classInt 0.4-10 2023-09-05 [1] CRAN (R 4.3.0) cli 3.6.2 2023-12-11 [1] CRAN (R 4.3.1) codetools 0.2-19 2023-02-01 [1] CRAN (R 4.3.3) DBI 1.2.2 2024-02-16 [1] CRAN (R 4.3.1) e1071 1.7-14 2023-12-06 [1] CRAN (R 4.3.1) KernSmooth 2.23-22 2023-07-10 [1] CRAN (R 4.3.3) lattice 0.22-5 2023-10-24 [1] CRAN (R 4.3.3) lwgeom * 0.2-14 2024-02-21 [1] CRAN (R 4.3.1) magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.3.0) Matrix 1.6-5 2024-01-11 [1] CRAN (R 4.3.3) proxy 0.4-27 2022-06-09 [1] CRAN (R 4.3.0) raster 3.6-26 2023-10-14 [1] CRAN (R 4.3.1) Rcpp 1.0.12 2024-01-09 [1] CRAN (R 4.3.1) rstudioapi 0.15.0 2023-07-07 [1] CRAN (R 4.3.0) sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.3.0) sf * 1.0-16 2024-03-24 [1] CRAN (R 4.3.1) simplextree 1.0.1 2020-09-12 [1] CRAN (R 4.3.0) sp 2.1-4 2024-04-30 [1] CRAN (R 4.3.1) terra 1.7-71 2024-01-31 [1] CRAN (R 4.3.1) units 0.8-5 2023-11-28 [1] CRAN (R 4.3.1) [1] /Library/Frameworks/R.framework/Versions/4.3-arm64/Resources/library ───────────────────────────────────────────────────────────────────────────────────────────────────────────────── > sf_extSoftVersion() GEOS GDAL proj.4 GDAL_with_GEOS USE_PROJ_H PROJ "3.11.0" "3.5.3" "9.1.0" "true" "true" "9.1.0"
rsbivand commented 4 days ago

Please add set.seed to any example usong the RNG. What precision settings do you prefer to apply?

DOSull commented 4 days ago

Point taken about set.seed - although in this case, part of the point is the variability. When I set precision I usually go with 1e8 as that's a number that I think you've said in the past has some previous currency! However, in this example I have to set it much more crudely than that st_set_precision(1)! for the outcome to be consistently what I would hope/expect.

I know that line intersection is notoriously challenging, but that seems rather extreme!

rsbivand commented 4 days ago

1e8 in this context is the inverse, so:

> formatC(1/1e8, format="f", digits=8)
[1] "0.00000001"

and so on. Precision means converting the coordinates from double precision to an integer grid by multiplying by say 1e8 and checking for integer equality on that grid. sf by default uses floating point double precision rather than the integer grid (which was the default in defunct rgeos). So any jitter greater than the chosen precision will lead to the point being seen as not equal.

> set.seed(1)
> sd <- 0.01
> lines <- st_multilinestring(list(matrix(c(-1, 1, -1, 1) + 
+                                            rnorm(4, 0, sd), 2, 2), 
+                                   matrix(c(-1, 1, 1, -1) + 
+                                            rnorm(4, 0, sd), 2, 2))) |> 
+   st_sfc() |> st_cast("LINESTRING")
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> 
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate
was `intersects'
 1: (empty)
> st_precision(lines) <- 1e8
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate
was `intersects'
 1: (empty)
> st_precision(lines) <- 1e2
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate
was `intersects'
 1: 1, 2
DOSull commented 3 days ago

I know that precision is set in the opposite sense to how one might expect i.e., using a 'scale' multiplier, per the JTS documentation at https://locationtech.github.io/jts/javadoc/org/locationtech/jts/geom/PrecisionModel.html linked from the st_as_binary help. I made that mistake a while back in a different issue #1855!

But the above is not what I get...

> set.seed(1)
> sd <- 0.1
> lines <- st_multilinestring(list(matrix(c(-1, 1, -1, 1) + 
+                                            rnorm(4, 0, sd), 2, 2), 
+                                   matrix(c(-1, 1, 1, -1) + 
+                                            rnorm(4, 0, sd), 2, 2))) |> 
+   st_sfc() |> st_cast("LINESTRING")
> 
> st_precision(lines) <- 1e8
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
 1: (empty)
> 
> st_precision(lines) <- 1e2
> pt <- st_intersection(lines[1], lines[2]) 
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
 1: (empty)

In any case... I'm not jittering the point, I'm changing the ends of the lines, and they are intersecting to yield the point, which by definition ought therefore to intersect the lines that generated it? It also seems wrong that to get this result we need a grid as crude as 0.01 resolution (and on my machine I have to go to 0.1, i.e. st_precision(...) <- 10, and even then I frequently get 'misses'!

> set.seed(2)
> sd <- 0.1
> lines <- st_multilinestring(list(matrix(c(-1, 1, -1, 1) +
+                                            rnorm(4, 0, sd), 2, 2), 
+                                   matrix(c(-1, 1, 1, -1) +
+                                            rnorm(4, 0, sd), 2, 2))) |> 
+   st_sfc() |> st_cast("LINESTRING")
> 
> st_precision(lines) <- 1e8
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
 1: (empty)
> 
> st_precision(lines) <- 1e1
> pt <- st_intersection(lines[1], lines[2]) 
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate was `intersects'
 1: (empty)

Also worth noting how apparent the poor resolution is when these objects are plotted. The plot below is for the objects generated with set.seed(2) and st_precision(lines) <- 1e1 on my machine:

> plot(lines)
> plot(pt, add = TRUE)

image

For what it's worth geos::geos_intersects yields the same incorrect result on my machine.

> geos::geos_intersects(pt, lines)
[1] FALSE FALSE

Is there some way in which this could be machine-dependent? MacOS ARM architecture here.

rsbivand commented 3 days ago

How was sf installed, CRAN binary or otherwise?

rsbivand commented 3 days ago

Trying on an aarch64 with R 4.4.1 and sf 1.0-16, I get your results, so a double-precision 64-bit/extended precision 80-bit difference most likely. My GEOS on Intel (Fedora 40) is newer too, but not much. What you are "jittering" are the endpoints of the lines, so indeed something is going on here. @paleolimbot any thoughts?

DOSull commented 3 days ago

I would have installed from CRAN. I'm glad to hear that I am probably not missing something obvious!

DOSull commented 3 days ago

Also... this:

> st_distance(lines, pt)
            [,1]
[1,] 8.21824e-17
[2,] 0.00000e+00

where I get some variation on 0 running many times with different random line end offsets, but the intersections still come up empty.

rsbivand commented 20 hours ago

It looks as though the lines would need to be noded at the intersection:

> set.seed(1)
> lines <- st_multilinestring(list(matrix(c(-1, 1, -1, 1) + 
+                                            rnorm(4, 0, sd), 2, 2), 
+                                   matrix(c(-1, 1, 1, -1) + 
+                                            rnorm(4, 0, sd), 2, 2))) |> 
+   st_sfc() |> st_cast("LINESTRING")
> pt <- st_intersection(lines[1], lines[2]) |> st_sfc()
> st_intersects(pt, lines)
Sparse geometry binary predicate list of length 1, where the predicate
was `intersects'
 1: (empty)
> st_intersects(pt, st_union(lines))
Sparse geometry binary predicate list of length 1, where the predicate
was `intersects'
 1: 1
> st_coordinates(lines)
              X          Y L1
[1,] -1.0062645 -1.0083563  1
[2,]  1.0018364  1.0159528  1
[3,] -0.9967049  1.0048743  2
[4,]  0.9917953 -0.9926168  2
> st_coordinates(st_union(lines))
                X            Y L1 L2
[1,] -1.006264538 -1.008356286  1  1
[2,] -0.001176252  0.004844437  1  1
[3,] -0.001176252  0.004844437  2  1
[4,]  1.001836433  1.015952808  2  1
[5,] -0.996704922  1.004874291  3  1
[6,] -0.001176252  0.004844437  3  1
[7,] -0.001176252  0.004844437  4  1
[8,]  0.991795316 -0.992616753  4  1
> st_coordinates(pt)
                X           Y
[1,] -0.001176252 0.004844437
> st_length(st_nearest_points(pt, lines))
[1] 1.612109e-16 6.930893e-17
> st_length(st_nearest_points(pt, st_union(lines)))
[1] 0

See also https://github.com/libgeos/geos/issues/585:

> st_is_within_distance(pt, lines, .Machine$double.eps)
Sparse geometry binary predicate list of length 1, where the predicate
was `is_within_distance'
 1: 1, 2
DOSull commented 14 hours ago

So this is obviously the 'feature-not-a-bug' that prevents me using lwgeom::st_split to successfully break linestrings at all their points of intersection. I've written code that works around it by inserting all the intersection points into the linestrings but it's not practical because it's slow! Frustrating!