r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/

Idea for a simple function to remove polygon slivers #547

Closed dkyleward closed 6 years ago

dkyleward commented 6 years ago

I get a lot of zonal/polygon shapefiles from clients that have numerous topological issues. The most common (as in, it's in all of them) is the existence of gaps (or slivers) between polygon boundaries. It took me a little while to understand the various sf tools I needed to fix them, but I finally did it with the following:

shp <- st_buffer(shp, .0001)
shp <- st_difference(shp, shp)

Would it be worthwhile to wrap this in something like an st_squeeze() or st_slivers()? There are tons of Google searches for it. One thing I did notice is that the above method is pretty slow for shapefiles with lots of polygons.

If there is a better way (I'm only just getting familiar with geojson and topojson) then I'm all ears. Maybe you even already have a function that I missed.

BTW, can sf read/write/convert geo/topojson? Thanks!

Zedseayou commented 6 years ago

What does "between polygon boundaries" mean? As in, you have polygons that should share an edge but actually have a small gap?

I am not sure what the code you provided is supposed to do; with a normal polygon (and I just tried) it buffers it by a small amount (so it gets a bit bigger) and then subtracts it from itself, leaving no geometry behind.

Yes, st_read and st_write handle GeoJSON well; they guess the driver if you use a dsn (path) that ends in .geojson. The second vignette lists the extensions and their associated drivers.
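
As a minimal sketch of that round trip (assuming the nc.shp example data that ships with sf, and a temporary file path chosen only for illustration):

```r
library(sf)

# Example data shipped with sf: North Carolina counties.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# Write to a temporary path ending in .geojson; the driver is guessed
# from the file extension, so none is passed explicitly.
path <- tempfile(fileext = ".geojson")
st_write(nc, path, quiet = TRUE)

# Read the file back in the same way.
nc_json <- st_read(path, quiet = TRUE)
```

The feature count of nc_json can then be compared with nrow(nc) as a quick sanity check.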

dkyleward commented 6 years ago

Yep - small gaps.

Take the NC county shape in the sf package (imagining it had some gaps). If you st_buffer() %>% st_difference(), each polygon gets buffered (filling in the gaps), and then st_difference() cleans everything up so that the shared boundary is the same. To make that clearer, I'll take a shapefile I've been given, which is also a bunch of adjacent polygons. I can check for gaps by combining them into a single polygon:

taz[taz$TAZ < 20,] %>%
  st_combine() %>%
  st_union() %>%
  plot()
[Screenshot: plot of the combined polygons]

Instead of getting a nice, clean polygon, the gaps between the original polygons are evident. If instead I start with st_buffer and st_difference:

test <- taz[taz$TAZ < 20,] %>% st_buffer(.00001)
st_difference(test, test) %>%
  st_combine() %>%
  st_union() %>%
  plot()
[Screenshot: plot of the combined polygons after buffer/difference]

The buffer/difference made sure that there were no gaps between the polygons. I've heard this referred to as "squeezing" polygons. It would be really handy to have a function like:

st_squeeze(sf = NULL, buffer_dist = .0001)
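
One possible shape for such a wrapper is sketched below. The helper st_squeeze() is hypothetical and not part of sf. The sketch also swaps in the single-argument form of st_difference(), which erases overlaps feature by feature, rather than the two-argument call used above; it assumes a reasonably recent sf and projected coordinates, so that buffer_dist is in map units. The EPSG code 32119 and the 1 m distance are illustrative choices only.

```r
library(sf)

# Hypothetical helper (NOT part of sf): buffer every feature slightly so
# that neighbours overlap across the gaps, then erase the overlaps.
st_squeeze <- function(x, buffer_dist = 0.0001) {
  stopifnot(inherits(x, c("sf", "sfc")))
  buffered <- st_buffer(x, buffer_dist)
  # With a single argument, st_difference() removes from each feature the
  # area already covered by features that come earlier in the object.
  st_difference(buffered)
}

# Usage on the example data shipped with sf, projected to metres first.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
nc_proj <- st_transform(nc[1:10, ], 32119)
squeezed <- st_squeeze(nc_proj, buffer_dist = 1)
```

Because earlier features win the overlap, the result depends on feature order, and the outer boundary also grows by buffer_dist.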

Anyway, thanks for your time.

RE st_read/write and json: Awesome!

edzer commented 6 years ago

I can see this problem, and agree that a generic solution would be welcome; however, the problem is not specific to this R package, and a solution also supported by e.g. QGIS, PostGIS or SpatiaLite would make life much easier here. Also, package lwgeom has functions st_make_valid and st_snap, which might be useful in some way for this problem. Feel free to reopen here once you have more than an idea.

mikoontz commented 5 years ago

For posterity:

I was using the squeeze method until I needed to do an intersection with the resulting unioned polygon and found that it wasn't valid. I tried using the st_snap() function instead and got the desired results with a bonus of also getting a valid polygon:

shp <- shp %>%
  st_snap(x = ., y = ., tolerance = 0.0001) %>%
  st_union()
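
A minimal sketch of checking validity around the snap-and-union step follows, assuming the nc.shp example data from sf and a version of sf that provides st_make_valid(). The projection (EPSG:32119) and the 1 m tolerance are illustrative assumptions, not values from this thread.

```r
library(sf)

# Example data shipped with sf, projected so the tolerance is in metres.
nc <- st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)
shp <- st_transform(nc[1:10, ], 32119)

# Snap vertices of each feature to nearby vertices of the layer.
snapped <- st_snap(shp, shp, tolerance = 1)

# Repair any feature that snapping left invalid before dissolving.
if (!all(st_is_valid(snapped))) {
  snapped <- st_make_valid(snapped)
}

# Dissolve into one geometry and repair it too, if needed.
merged <- st_union(snapped)
if (!all(st_is_valid(merged))) {
  merged <- st_make_valid(merged)
}
```

Running st_is_valid(merged) before later operations such as st_intersection() helps catch the problem described above early.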

njlyon0 commented 2 years ago

I had an issue in this vein because lakes were missing from my watershed polygons, and this Stack Overflow post helped me.

To simplify things, the answer (for me) was as follows: nngeo::st_remove_holes(sf_object)

With the caveat that using this changes the "geometry" feature to be called simply "geom" so you'll need to account for that downstream if you're referencing "geometry" by name

Edit:

I took @edzer's advice and posted an issue with nngeo about this (see here: https://github.com/michaeldorman/nngeo/issues/23) and they resolved it! The development version no longer renames the geometry feature.

edzer commented 2 years ago

With the caveat that using this changes the "geometry" feature to be called simply "geom" so you'll need to account for that downstream if you're referencing "geometry" by name

You could raise that as an issue with nngeo, of course.

fgoerlich commented 2 years ago

Maybe a tool along these lines, for smart dissolving, would be useful for removing slivers and much more generally.