Closed gibbona1 closed 1 month ago
If your data is in geographic coordinate system, the s2 engine will be used for calculations on ellipsoid in {sf} ({terra} usually uses the GEOS engine for planar calculations), which may be the reason for the difference (check the sf::sf_use_s2(FALSE)
option for comparison of results).
If you want to speed up sf::st_intersection()
, maybe this thread will helpful: https://github.com/r-spatial/sf/issues/801
Thanks for the quick answer @kadyb! Seems to work well.
I had sf::sf_use_s2(TRUE)
by default in my shiny app so I'll switch it off for better comparison.
Apologies for reopening but there still seems to be an issue in some spatial datasets. switching s2 to off by default worked for intersecting one pair of datasets and giving very close results, but I have an example where the results are very different between sf
and terra
.
Data is available in an external repo related to my shiny app here.
I want to intersect the two datasets and have each row have a polygon with the columns CODE_12
, the code of the polygon from sf1
(in 2012), and CODE_18
, the same for sf2
(2018).
terra::intersect
does this operation considerably faster (2-20 times from what I've seen) than sf::st_intersection
and so I want to implement it as an option as I have heard from some users that speed is a drawback of the app for larger datasets.
Despite this, the results are extremely different. The intersection gives a different number of rows (which could be okay), but the CODE_12
column of df_int
only has two values, 111 and 112, whereas df_int2
is as expected. CODE_18
produces differences also, though not as drastic.
Is there any reason why this could be? Is it the data, my code, or terra
itself? @kadyb any intuition on what might be going on?
library(sf)
library(terra)
library(dplyr)
sf_folder <- "map_data/Dargle"
f1 <- "dargle_CLC2012.shp"
f2 <- "dargle_CLC2018.shp"
sf_use_s2(FALSE)
#read data
sf1 <- st_read(file.path(sf_folder, f1)) %>% st_make_valid()
sf2 <- st_read(file.path(sf_folder, f2)) %>% st_make_valid()
terra1 <- vect(sf1)
terra2 <- vect(sf2)
df_int <- terra::intersect(terra1, terra2) %>% st_as_sf()
df_int %>%
arrange(CODE_12) %>%
select(geometry, CODE_12) %>%
plot()
df_int2 <- st_intersection(sf1, sf2)
df_int2 %>%
arrange(CODE_12) %>%
select(geometry, CODE_12) %>%
plot()
using dplyr_1.1.4
, terra_1.7-55
, sf_1.0-14
.
df_int2
(using just sf
):
df_int
(using terra
and converting back to sf
):
Try updating {terra} to the latest version. Using your code in df_int$CODE_12
column from {terra} I have 21 categories, like in {sf}.
df_int %>%
arrange(CODE_12) %>%
select(geometry, CODE_12) %>%
plot()
unique(df_int$CODE_12)
#> [1] "111" "112" "121" "122" "131" "133" "141" "142" "211" "231" "242" "243" "311" "312"
#> [15] "313" "322" "324" "333" "412" "512" "523"
BTW: In this line: df_int2 <- st_intersection(df1, df2)
, objects should rather be sf1
and sf2
?
Yes apologies, updated df1
and df2
to sf1
and sf2
.
Can confirm that updateing to terra_1.7-78
and sf_1.0-17
has solved the issue
Thanks again!
If I have two sf objects (polygons and multipolygons),
df1
anddf2
, which cover the same area but at different times. I want to get the intersection of the two maps.I currently use
df_int <- st_intersection(df1, df2)
which is correct but the result is pretty slow for large sf objects. The below method is significanlty faster (10-20 times) but sometimes has different results. any idea why this might be the case? They have the same CRS.I've also tried
intersect(terra1, terra2)
which can be different again.How do I get the speed benefits of
terra
but have basically identical output to thesf
version?