Closed rsbivand closed 5 years ago
As requested by @rsbivand here is a short R script to replicate the issue I encountered while using ms_simplify
# Create a Polygons object containing Polygon objects with two points of contact
library(sp)
r1 <- rbind(c(3,1),c(1,2),c(2,3),c(1,4),c(3,5),c(5,4),c(4,3),c(5,2),c(3,1))
r2 <- r1; r2[,1] <- r2[,1]+4
Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)), data.frame(Example=c("Minimal")))
rgeos::gIsValid(SPDF)
# [1] TRUE
SPDF2 <- rmapshaper::ms_simplify(SPDF,keep=0.6,keep_shapes=TRUE)
rgeos::gIsValid(SPDF2)
# [1] FALSE
# Warning message:
# In RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid") :
# Self-intersection at or near point 5 2
Also attaching a plot comparing SPDF and SPDF2
Hopefully this is enough to illustrate the issue. The original postal code geometry (DE 40468) contained a similar region around the size of a motorway flyover, which I suspect is just an artifact of the way the geometry was generated. However, I would have to make further inquiries to find out how this was done, and to establish if I can share it externally.
This is great information - thanks @rsbivand and @phaines-rms. The issue is in the mapshaper
js library:
library(sp)
r1 <- rbind(c(3,1),c(1,2),c(2,3),c(1,4),c(3,5),c(5,4),c(4,3),c(5,2),c(3,1))
r2 <- r1; r2[,1] <- r2[,1]+4
Ps1 = Polygons(list(Polygon(r1),Polygon(r2)),ID=1)
SPDF = SpatialPolygonsDataFrame( SpatialPolygons(list(Ps1)), data.frame(Example=c("Minimal")))
rgeos::gIsValid(SPDF)
#> [1] TRUE
SPDF2 <- rmapshaper::ms_simplify(SPDF,keep=0.6,keep_shapes=TRUE, sys = TRUE)
rgeos::gIsValid(SPDF2)
#> Warning in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid"): Self-
#> intersection at or near point 5 2
#> [1] FALSE
library(rgdal)
#> rgdal: version: 1.4-4, (SVN revision 833)
#> Geospatial Data Abstraction Library extensions to R successfully loaded
#> Loaded GDAL runtime: GDAL 3.0.0, released 2019/05/05
#> Path to GDAL shared files:
#> GDAL binary built with GEOS: TRUE
#> Loaded PROJ.4 runtime: Rel. 6.1.1, July 1st, 2019, [PJ_VERSION: 611]
#> Path to PROJ.4 shared files: (autodetected)
#> Linking to sp version: 1.3-1
# Write to geojson
tf <- tempfile(fileext = ".geojson")
writeOGR(SPDF, tf, "GeoJSON", driver="GeoJSON")
## Check validity of written object (valid):
system(sprintf('ogrinfo %s -q -dialect sqlite -sql "select st_isvalid(geometry) from %s"',
tf, "GeoJSON"), intern = TRUE)
#> [1] ""
#> [2] "Layer name: SELECT"
#> [3] "OGRFeature(SELECT):0"
#> [4] " st_isvalid(geometry) (Integer) = 1"
#> [5] ""
## Simplify with mapshaper cli tool
system2("mapshaper", "--version", stderr = TRUE)
#> [1] "0.4.126"
tf2 <- tempfile(fileext = ".geojson")
system(sprintf("mapshaper %s -simplify 0.6 -o %s", tf, tf2))
## Check validity of simplified object (invalid):
system(sprintf('ogrinfo %s -q -dialect sqlite -sql "select st_isvalid(geometry) from %s"',
tf2, tools::file_path_sans_ext(basename(tf2))), intern = TRUE)
#> [1] ""
#> [2] "Layer name: SELECT"
#> [3] "OGRFeature(SELECT):0"
#> [4] " st_isvalid(geometry) (Integer) = 0"
#> [5] ""
## Read back into R and confirm it is invalid
SPDF3 <- readOGR(tf2)
#> OGR data source with driver: GeoJSON
#> Source: "/private/var/folders/2w/x5wq73f93yzgm7hjr_b_54q00000gp/T/Rtmp5E3Pqk/file15bfa2c071ae4.geojson", layer: "file15bfa2c071ae4"
#> with 1 features
#> It has 1 fields
rgeos::gIsValid(SPDF3)
#> Warning in RGEOSUnaryPredFunc(spgeom, byid, "rgeos_isvalid"): Self-
#> intersection at or near point 5 2
#> [1] FALSE
Created on 2019-08-19 by the reprex package (v0.3.0)
As you highlighted in the gdal
issue you linked to @rsbivand, invalid geometries are "allowed", and many operations can create them, so I'm not sure how much of a problem this is, although might be worth highlighting to the mapshaper
maintainer that it can create invalid geometries.
Thanks! I should think that alerting the mapshaper
maintainer is sensible. rmapshaper does import from sf, so simple checks of output geometries could be added as an option. You also suggest rgeos, so for sp objects rgeos could be used for the same checks if available.
This is the next in a series of issues affecting invalid geometries that have turned up this month, with https://github.com/mtennekes/tmap/issues/333 and https://github.com/r-spatial/sf/issues/1121 , so keeping objects SF-valid seems very likely to become more necessary. It would thus benefit everyone if code creating objects itself ensured their validity, even if until now invalid objects have worked in existing workflows.
Closing this in now, and opened https://github.com/mbloch/mapshaper/issues/352 in mapshaper
, and #90 here to implement your suggestions @rsbivand - thanks!
@ateucher : a head's up for https://github.com/r-spatial/sf/issues/1130 and its fix in https://github.com/OSGeo/gdal/issues/1787, wrt. https://stat.ethz.ch/pipermail/r-sig-geo/2019-August/027569.html and the original poster's comment in https://stat.ethz.ch/pipermail/r-sig-geo/2019-August/027580.html that the problem arose when trying to write output geometries from rmapshaper with the ESRI Shapefile driver. I'm asking for a simplified workflow, to check whether the invalid output polygons were also invalid when presented to the js library - please comment when we know how far back we can go to trace this.