ncss-tech / sharpshootR

Miscellaneous soil data management, summary, visualization, and conversion utilities to support soil survey.
http://ncss-tech.github.io/sharpshootR/
17 stars 3 forks source link

constantDensitySampling gives wrong output with MULTIPOLYGONS #26

Open dylanbeaudette opened 4 years ago

dylanbeaudette commented 4 years ago

constantDensitySampling() was designed to work with sp class objects which don't differentiate between POLYGON and MULTIPOLYGON types. Exploding the MULTIPOLYGONS into POLYGONS via sf is a simple work-around. Long-term, a more robust solution is required, likely switching to sf-class objects / methods.

Not really repeatable, but the basic idea.

library(rgdal)
library(sp)
library(sf)
library(soilDB)
library(sharpshootR)

# load as sp
x <- readOGR(dsn='e:/temp/HI-sampling/Hawaii', layer = 'MLRA_v51_HI')

# check, must be projected CRS
proj4string(x)

# feature ID
x$pID <- 1:nrow(x)

# convert to sf
x.sf <- st_as_sf(x)
x.sf.poly <- st_cast(x.sf, 'POLYGON')

# and back
x.sp.poly <- as(x.sf.poly, 'Spatial')

# sampling on MULTIPOLYGONS
s <- constantDensitySampling(x, min.samples = 0, n.pts.per.ac = 0.001)

# sampling on POLYGONS
x.sp.poly$pID <- 1:nrow(x.sp.poly)
s.2 <- constantDensitySampling(x.sp.poly, min.samples = 0, n.pts.per.ac = 0.001)

# save
writeOGR(s, dsn='e:/temp/HI-sampling', layer='samples', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
writeOGR(s.2, dsn='e:/temp/HI-sampling', layer='samples-2', driver = 'ESRI Shapefile', overwrite_layer = TRUE)
brownag commented 4 years ago

To me this looks like the shapefile contains one or more multipart features -- so the st_as_sf() result has MULTIPOLYGON geometry.

constantDensitySampling number of sample heuristics are based on total polygon area -- without respect to whether it is multipart or not. It was designed for "clean" SSURGO-topologically-correct (no multipart) features.

The spurious samples are a degenerate case of spsample operating on a multipart polygon. Converting to POLYGON with sf has the net effect of "exploding" the multipart feature, so that constantDensitySampling can catch that those are really small polygons. If you set the minimum samples to zero, and sampling density is low enough, they would be "ignored'.

As far as a clean way to do this with sp... nah. sf is the way to go. And explode your multipart features before sampling.

brownag commented 4 years ago

I'll add that this seems to work as intended on MULTIPOLYGON -- if those multipart features weren't so comically small compared to the big one we would see constant density samples across all parts (I predict) -- as we should

dylanbeaudette commented 4 years ago

Long-term lets move this over to sf and force iteration of POLYGONs.

brownag commented 4 years ago

Here is a slightly more obvious depiction of what is going on with MULTIPOLYGON feature input to constantDensitySampling. EDIT: I was thinking that there might have been an issue, but the previous code had an error in it.

image

library(sf)
library(sharpshootR)

# replace with any shapefile containing two POLYGON features
f <- st_read("E:/poly_test.shp")

# one MULTIPOLYGON
f.m <- f[1,]
geo <- st_combine(st_cast(f, "MULTIPOLYGON"))
f.m$geometry <- geo

# inspect
par(mfrow=c(1,2), mar=c(1,1,1,1))
plot(f$geometry)
f$pID <- 1:2
f.s <- constantDensitySampling(as_Spatial(f), min.samples = 0, n.pts.per.ac = 0.1)
points(f.s)
box()

plot(f.m$geometry)
f.m$pID <- 1
f.m.s <- constantDensitySampling(as_Spatial(f.m), min.samples = 0, n.pts.per.ac = 0.1)
points(f.m.s)
box()
brownag commented 4 years ago

I had made a nasty mistake:

# one MULTIPOLYGON
f.m <- f
geo <- st_combine(st_cast(f, "MULTIPOLYGON"))
f.m$geometry <- geo

versus

# one MULTIPOLYGON
f.m <- f[1,]
geo <- st_combine(st_cast(f, "MULTIPOLYGON"))
f.m$geometry <- geo

I was inadvertently duping the geometry -- creating overlapping features that got sampled "correctly". It does seem that depending on the way the multipolygons were generated, the constantDensitySampling results could vary.

dylanbeaudette commented 4 years ago

Thanks for testing this. I'll likely return to this after closing some aqp issues.

dylanbeaudette commented 4 years ago

Thanks for testing this. I'll likely return to this after closing some aqp issues.