r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.34k stars 298 forks source link

Why does st_join fail for some point-in-polygon operations? #2408

Closed Nova-Scotia closed 4 months ago

Nova-Scotia commented 4 months ago

I'm puzzled as to why st_join is not working for some point-in-polygon operations. I'm using a province-specific projection to determine what snow measurement stations (point feature) fall in which districts (polygon feature). I noticed that none of the snow stations in Dryden district are showing up as present in that district.

Here's a reproducible example. I downloaded the original spatial data from ArcGIS Online (snow locations: https://ws.lioservices.lrc.gov.on.ca/arcgis2/rest/services/LIO_OPEN_DATA/LIO_Open08/MapServer/31; districts: https://services1.arcgis.com/TJH5KDher0W13Kgo/arcgis/rest/services/MNR_Administrative_Boundaries/FeatureServer/1) and brought them into R with the arcgisbinding package. I transformed them to sf compatible objects using arc.data2sf(). Then I transformed them to a common projection for Ontario (Lambert Conformal Conic) using st_transform(3162).

I filtered the datasets to three features each to keep it simple, and attached them as text files to this issue.

I'm using R V. 4.3.2 and sf_1.0-16 on a Windows 10 computer.

snowlocs.txt districts_old.txt

library(tidyverse)
library(sf)

snowlocs <- dget("snowlocs.txt")
districts_old <- dget("districts_old.txt")

st_crs(snowlocs) == st_crs(districts_old)

ggplot(districts_old) +
  geom_sf(aes(fill = DISTRICT_NAME)) +
  geom_sf(data = snowlocs) +
  geom_sf_text(data = snowlocs, aes(label = STATION_NUMBER), nudge_y = -10000)

# Why doesn't SNOW-WLF-DY get a district?
check <- st_join(snowlocs %>% select(STATION_NUMBER), 
                 districts_old %>% select(DISTRICT_NAME),
                 left = TRUE)
check
#  STATION_NUMBER                geom DISTRICT_NAME
#* <chr>                  <POINT [m]> <chr>        
#1 SNOW-WLF-DY    (381646.6 12597767) NA           
#2 SNOW-WLF-MK    (236571.6 12639511) Kenora       
#3 SNOW-WLF-PK    (570209.5 12772131) Sioux Lookout

Here's an image of the three example stations and three example districts: image

edzer commented 4 months ago

Your code gives me

Error in eval(ei, envir) : object 'test' not found 

please modify such that it runs.

Nova-Scotia commented 4 months ago

Sorry about that - fixed now.

edzer commented 4 months ago

the first feature of districts_old has an invalid geometry. You can correct it with st_make_valid(districts_old).

Nova-Scotia commented 4 months ago

Thanks! It's interesting that it fails completely when the geometry isn't valid - I didn't think to check for that (even though I've run into similar issues in the past - sigh). If a simple st_make_valid fixes the issue, I wonder if st_join could check for this and give a warning message if it sees that the results would be different if the geometry was valid? Maybe that would reduce the performance too much - but do you have any ideas on how the function might hint to the user this might be the case (even in the help files for st_join)?

edzer commented 4 months ago

I agree that a hint in the help for st_join might be good, and potentially in several other places.

Nova-Scotia commented 4 months ago

Thank you - hopefully it saves someone (including future me) some hair pulling!