paleolimbot / geos

Open Source Geometry Engine ('GEOS') R API
https://paleolimbot.github.io/geos/
Other
61 stars 8 forks source link

geos doesn't properly convert to sf geometry #80

Closed JosiahParry closed 1 year ago

JosiahParry commented 1 year ago

I'm not so sure what is happening but it appears that there is a challenge when translating from geos geometry to sf geometry.

Here's a reproducible example. It fetches geojson from ArcGIS online. Parses it into a string and reads it into geos geometry. This can be plotted jsut fine but it can't be plotted after being cast to an sf object.

library(httr2)

url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Census_2020_Redistricting_Blocks/FeatureServer/0/query?where=1+%3D1+&objectIds=&time=&geometry=&geometryType=esriGeometryPolygon&inSR=&spatialRel=esriSpatialRelIntersects&resultType=none&distance=0.0&units=esriSRUnit_Meter&relationParam=&returnGeodetic=false&outFields=*&returnGeometry=true&returnCentroid=false&featureEncoding=esriDefault&multipatchOption=xyFootprint&maxAllowableOffset=&geometryPrecision=&outSR=&defaultSR=&datumTransformation=&applyVCSProjection=false&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnExtentOnly=false&returnQueryGeometry=false&returnDistinctValues=false&cacheHint=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&having=&resultOffset=&resultRecordCount=&returnZ=false&returnM=false&returnExceededLimitFeatures=true&quantizationParameters=&sqlFormat=none&f=pgeojson&token="

res <- request(url) |> 
  req_perform() |> 
  resp_body_string() 

res_geometry <- geos::geos_read_geojson(res) |> 
  geos::geos_unnest()

sf::st_as_sfc(res_geometry) |>
  plot()

#> Error in CPL_gdal_dimension(st_geometry(x), NA_if_empty) : 
#>   Not compatible with STRSXP: [type=NULL].

sf::st_read(res) |> 
  sf::st_geometry() |> 
  plot()
image
JosiahParry commented 1 year ago

It appears that this is because geos_unnest() sets the geometry type to "GEOMETRY" rather than "MULTIPOLYGON" per https://github.com/r-spatial/sf/issues/1443

The solution is to cast to the appropriate geometry type using

sf::st_as_sfc(res_geometry) |>
  st_cast("MULTIPOLYGON") |> 
  plot()
JosiahParry commented 1 year ago

Looking further, this seems to be an issue with wk

wk::wk_handle(res_geometry, wk::sfc_writer())

Geometry set for 2000 features 
Geometry type: GEOMETRY
Dimension:     XY
Bounding box:  xmin: -178.4436 ymin: 20.5 xmax: -156.6754 ymax: 28.51727
CRS:           NA
paleolimbot commented 1 year ago

sf is correct - there is indeed more than one type of geometry here (polygon and multipolygon). GEOS can do that for you too but it requires some trickery:

library(httr2)

url <- "https://services.arcgis.com/P3ePLMYs2RVChkJx/arcgis/rest/services/USA_Census_2020_Redistricting_Blocks/FeatureServer/0/query?where=1+%3D1+&objectIds=&time=&geometry=&geometryType=esriGeometryPolygon&inSR=&spatialRel=esriSpatialRelIntersects&resultType=none&distance=0.0&units=esriSRUnit_Meter&relationParam=&returnGeodetic=false&outFields=*&returnGeometry=true&returnCentroid=false&featureEncoding=esriDefault&multipatchOption=xyFootprint&maxAllowableOffset=&geometryPrecision=&outSR=&defaultSR=&datumTransformation=&applyVCSProjection=false&returnIdsOnly=false&returnUniqueIdsOnly=false&returnCountOnly=false&returnExtentOnly=false&returnQueryGeometry=false&returnDistinctValues=false&cacheHint=false&orderByFields=&groupByFieldsForStatistics=&outStatistics=&having=&resultOffset=&resultRecordCount=&returnZ=false&returnM=false&returnExceededLimitFeatures=true&quantizationParameters=&sqlFormat=none&f=pgeojson&token="

res <- request(url) |> 
  req_perform() |> 
  resp_body_string() 

res_geometry <- geos::geos_read_geojson(res) |> 
  geos::geos_unnest()

# it's because this GeoJSON doesn't "promote" single geometries to multi geometries?
# (sf does this after reading from GDAL I believe)
unique(geos::geos_type(res_geometry))
#> [1] "polygon"      "multipolygon"

head(res_geometry)
#> <geos_geometry[6]>
#> [1] <POLYGON [-178.44 28.328...-178.23 28.517]>     
#> [2] <MULTIPOLYGON [-178.33 28.386...-178.29 28.402]>
#> [3] <POLYGON [-170.63 25.506...-170.63 25.507]>     
#> [4] <POLYGON [-170.68 25.456...-170.57 25.557]>     
#> [5] <POLYGON [-171.81 25.707...-171.67 25.838]>     
#> [6] <POLYGON [-171.75 25.756...-171.73 25.782]>

is_polygon <- geos::geos_type(res_geometry) == "polygon"
res_geometry[is_polygon] <- 
  geos::geos_make_collection(
    res_geometry[is_polygon],
    "multipolygon", 
    seq_len(sum(is_polygon))
  )

# seems to do the trick?
head(res_geometry)
#> <geos_geometry[6]>
#> [1] <MULTIPOLYGON [-178.44 28.328...-178.23 28.517]>
#> [2] <MULTIPOLYGON [-178.33 28.386...-178.29 28.402]>
#> [3] <MULTIPOLYGON [-170.63 25.506...-170.63 25.507]>
#> [4] <MULTIPOLYGON [-170.68 25.456...-170.57 25.557]>
#> [5] <MULTIPOLYGON [-171.81 25.707...-171.67 25.838]>
#> [6] <MULTIPOLYGON [-171.75 25.756...-171.73 25.782]>

Created on 2022-12-06 with reprex v2.0.2

I do wonder if I'm generating invalid sf objects, though (or maybe any attempt to do what you're trying to do with a GEOMETRY typed column will fail that way?)/

JosiahParry commented 1 year ago

Ah, got ya. When I read the same geojson from sf it automatically casts to a multipolygon so i suspected that was the appropriate conversion type. But I see the point now. This was super super helpful! I suspect that this can be closed.

Using the "trickery" is exceptionallyyyyy helpful in going from geos -> sf and saves a lot of time. Made it into a bit of a help function below. I'll close this issue now! : )

image
make_sf_compat <- function(geometry) {

  all_types <- c(
    "linestring",
    "multilinestring",
    "point",
    "multipoint",
    "polygon",
    "multipolygon"
  )

  # note difference between geoms and geom
  geoms_types <- geos::geos_type(geometry)
  geom_types <- unique(geoms_types)

  type_index <- all_types %in% geom_types
  n_types <- sum(type_index)
  # if more than two types match automatically a GEOMTRY
  # if exactly one type then it is cast properly
  # both conditions don't need any manipulation
  if (n_types == 1 || n_types > 2) {
    return(sf::st_as_sfc(geometry))
  # if there is exactly two types of matches these are either multi
  # or incompatible matches
    # LINESTRING checks
  } else if (all(type_index[1:2])) {
    is_linestring <- geoms_types == all_types[1]

    geometry[is_linestring] <-
      geos::geos_make_collection(
        res_geometry[is_linestring],
        all_types[2],
        seq_len(sum(is_linestring))
      )

    return(sf::st_as_sfc(geometry))

    # POINT / MULTIPOINT check
  } else if (all(type_index[3:4])) {

    is_point <- geoms_types == all_types[3]

    geometry[is_point] <-
      geos::geos_make_collection(
        res_geometry[is_point],
        all_types[4],
        seq_len(sum(is_point))
      )

    return(sf::st_as_sfc(geometry))

    # POLYGON / MULTIPOLYGON check
  } else if (all(type_index[5:6])) {
    is_polygon <- geoms_types == all_types[5]

    geometry[is_polygon] <-
      geos::geos_make_collection(
        geometry[is_polygon],
        all_types[6],
        seq_len(sum(is_polygon))
      )

    return(sf::st_as_sfc(geometry))

  } else {
    return(sf::st_as_sfc(geometry))
  }
}