r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.32k stars 293 forks source link

Problem with MULTISURFACE feature #2257

Closed lvalnegri closed 10 months ago

lvalnegri commented 10 months ago

Hello everyone,

I was trying to bind polygons corresponding to Italian buildings from the Italian Military Office (project DBSN), they offer one "gdb" file for each of the 107 provinces, containing lots of different things, but I'm currently interested only in the edifc layer.

When trying to st_make_valid at the end of the process, I've received:

Error in scan(text = lst[[length(lst)]], quiet = TRUE) : 
  scan() expected 'a real', got 'ParseException:'
Error in (function (msg)  : ParseException: Unknown WKB type 12

After some trial and error, I found the problem with the feature 123010 of the file https://www.igmi.org/scaricamento_dbsn_database/v202308/PN_dbsn_7I9g4s.zip, which has a weird geometry: "MULTISURFACE Z (CURVEPOLYGON Z (COMPOUNDCURVE Z (CIRCULARSTRING Z (7068463 5081492 0, 7068473 5081483 0, 7068463 5081473 0, 7068454 5081483 0, 7068463 5081492 0))))"

As commented here I've tried any combination of st_cast, st_extract, st_as_sf(c) but to no avail. For the moment being I've just deleted it, with another one found in another province, also because I do think it's a mistake. Any suggestion how I can automate the fix in case I found it again in future?

Thanks

lvalnegri commented 10 months ago

For the deletion part, I've already answered myself:

y[-which(st_is(y, 'MULTISURFACE')), , drop = FALSE]
edzer commented 10 months ago

Thanks; see also #2231

lvalnegri commented 9 months ago

just a quick head up if anyone is interested. I discovered that those multisurfaces weren't mistakes at all, but specific artifacts. For example, in another layer there are lots of them in Sicily, wind turbines to be precise. As I needed to represent at least the base, I came up with the following:

surf2poli <- \(y){
    y <- y |> 
            st_geometry() |> 
            st_as_text() |> 
            gsub('.*\\(([0-9].*[0-9])\\).*', '\\1', x = _) |> 
            gsub(' 0', '', x = _)
    paste0('POLYGON ((', y, ',', gsub("^(.*?),.*", '\\1', y), '))') |> 
        as.data.frame() |> 
        setNames('geometry') |> 
        st_as_sf(wkt = "geometry", crs = 6875) |> 
        st_as_sfc()
}
rsbivand commented 9 months ago
library(sf)
PN <- st_read("Pordenone_dbsn.gdb", layer="edifc")
st_write(PN, "PN_edifc0a.gpkg")
gdal_utils("vectortranslate", "PN_edifc0a.gpkg", "PN_edifca.gpkg", options=c("-nlt", "CONVERT_TO_LINEAR", "-dim", "XY"))
PN1 <- st_read("PN_edifca.gpkg")

The Z values appear to be zero n the MULTISURFACE cases:

aa <- st_geometry_type(PN)
st_as_text(st_geometry(PN)[which(aa == "MULTISURFACE")])

If the other Z values matter, one could subset just those and convert to linear, or analyse further to see whether any Zvalues make sense.

PNc <- st_coordinates(st_geometry(PN)[-which(aa == "MULTISURFACE")])
summary(PNc[,"Z"])
#      Min.   1st Qu.    Median      Mean   3rd Qu.      Max. 
# -29997.00     26.00     42.00    -20.05    119.00   1865.00 

It seems as though "CONVERT_TO_LINEAR" does not work here for "XYZ" objects, so here flattening:

PNms <- PN[which(aa == "MULTISURFACE"),]
st_write(PNms, "PN_edifc0ms.gpkg")
gdal_utils("vectortranslate", "PN_edifc0ms.gpkg", "PN_edifcms.gpkg", options=c("-nlt", "CONVERT_TO_LINEAR", "-dim", "XY"))
PN1ms <- st_read("PN_edifcms.gpkg")
st_geometry(PN)[which(aa == "MULTISURFACE")] <- st_geometry(PN1ms)