r-spatial / lwgeom

bindings to the liblwgeom library
https://r-spatial.github.io/lwgeom/
58 stars 23 forks source link

st_snap_to_grid #13

Closed jeffreyhanson closed 6 years ago

jeffreyhanson commented 6 years ago

Hi,

I'm submitting this pull request to add the functionality for snapping geometries to grids (using the proposed function st_snap_to_grid). This function aims to mimic the ST_SnapToGrid function in PostGIS. It is merely a thin wrapper to the lwgeom_grid_in_place function in lblwgeom.

The motivation behind this pull request is two-fold. Firstly, geometries with high precision can cause problems during geoprocessing, which may not necessarily be resolved using lwgeom::st_make_valid, and so snapping geometries to a grid can be an effective strategy for resolving these issues (see below for example, note that this has also been observed in equivalent PostGIS functions: 1, 2, 3). Secondly, snapping geometries to a grid might be a computationally cheap method for simplifying complex geometries (though, as far as I know, this remains to be tested).

Below is an example showing the primary motivation for this function. First, we will fetch some particularly messy spatial data from Protected Planet.

# install sf development version if needed
# devtools::install_github("r-spatial/sf")

# install lwgeom branch if needed
# devtools::install_github("jeffreyhanson/lwgeom@st_snap_to_grid")

# load packages
library(sf)
## Linking to GEOS 3.5.1, GDAL 2.2.2, proj.4 4.9.2
library(dplyr)
## 
## Attaching package: 'dplyr'

## The following objects are masked from 'package:stats':
## 
##     filter, lag

## The following object is masked from 'package:testthat':
## 
##     matches

## The following objects are masked from 'package:base':
## 
##     intersect, setdiff, setequal, union
library(lwgeom)
## Linking to liblwgeom 2.5.0dev r16016, GEOS 3.5.1, proj.4 4.9.2
# fetch data
download_url <- paste0("https://www.protectedplanet.net/downloads/",
                       "WDPA_Dec2017_MLT?type=shapefile")
zip_file <- file.path(tempdir(), basename(httr::HEAD(download_url)$url))
result <- httr::GET(download_url, httr::write_disk(zip_file, overwrite = TRUE))
zip_dir <- file.path(tempdir(), "MLT")
dir.create(zip_dir, showWarnings = FALSE, recursive = TRUE)
utils::unzip(zip_file, exdir = zip_dir)
shp_path <- dir(zip_dir, "^.*polygons\\.shp$", full.names = TRUE)

# load data
x <- sf::read_sf(shp_path)

# sort data by year established
x$STATUS_YR[x$STATUS_YR == 0] <- NA_real_
x <- x %>% arrange(STATUS_YR)

# print data
print(x)
## Simple feature collection with 284 features and 28 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: 13.7 ymin: 35.42 xmax: 14.94 ymax: 36.3
## epsg (SRID):    4326
## proj4string:    +proj=longlat +datum=WGS84 +no_defs
## # A tibble: 284 x 29
##    WDPAID WDPA_PID PA_DEF                                            NAME
##     <dbl>    <chr>  <chr>                                           <chr>
##  1 174757   174757      1      Il-Ġonna tal-Mall, Floriana (Siġar Antiki)
##  2 174758   174758      1                 Bidnija, Wardija (Żebbuġ Antik)
##  3 194415   194415      1           Il-Ġonna ta' San Anton (Siġar Antiki)
##  4 194416   194416      1                       Il-Buskett (Siġar Antiki)
##  5 194417   194417      1 Il-Wied ta' l-Imġiebaħ, Mellieħa (Ballut Antik)
##  6 194418   194418      1            Il-Ballut tal-Wardija (Ballut Antik)
##  7 194420   194420      1                                          Filfla
##  8 174737   174737      1                      Il-Baħar ta' Madwar Filfla
##  9 194423   194423      1                            Il-Ġebla tal-Ġeneral
## 10 194425   194425      1     Il-Gżejjer Selmunett (Gżejjer ta' San Pawl)
## # ... with 274 more rows, and 25 more variables: ORIG_NAME <chr>,
## #   DESIG <chr>, DESIG_ENG <chr>, DESIG_TYPE <chr>, IUCN_CAT <chr>,
## #   INT_CRIT <chr>, MARINE <chr>, REP_M_AREA <dbl>, GIS_M_AREA <dbl>,
## #   REP_AREA <dbl>, GIS_AREA <dbl>, NO_TAKE <chr>, NO_TK_AREA <dbl>,
## #   STATUS <chr>, STATUS_YR <dbl>, GOV_TYPE <chr>, OWN_TYPE <chr>,
## #   MANG_AUTH <chr>, MANG_PLAN <chr>, VERIF <chr>, METADATAID <dbl>,
## #   SUB_LOC <chr>, PARENT_ISO <chr>, ISO3 <chr>, geometry <simple_feature>

Now, we will attempt to clean the data by (i) fixing invalid geometries using existing sf and lwgeom functions, and (ii) removing overlapping geometries.

# attempt to clean data
(try({
  y <- x %>% sf::st_set_precision(1e+10) %>%
             lwgeom::st_make_valid() %>% # attempt to fix geometries
             sf::st_difference()
}))
## [1] "Error in CPL_nary_difference(x) : \n  Evaluation error: TopologyException: found non-noded intersection between LINESTRING (14.3435 35.9261, 14.3434 35.926) and LINESTRING (14.3435 35.926, 14.3435 35.926) at 14.34345491544504 35.926048414602668.\n"
## attr(,"class")
## [1] "try-error"
## attr(,"condition")
## <Rcpp::eval_error in CPL_nary_difference(x): Evaluation error: TopologyException: found non-noded intersection between LINESTRING (14.3435 35.9261, 14.3434 35.926) and LINESTRING (14.3435 35.926, 14.3435 35.926) at 14.34345491544504 35.926048414602668.>

As we can see, this attempt to clean the data did not work. Now, we will try adding in the proposed lwgeom::st_snap_to_grid function to the data cleaning process.

# clean data
y <- x %>% sf::st_set_precision(1e+10) %>%
           lwgeom::st_make_valid() %>% # attempt to fix geometries
           sf::st_transform(3395) %>% # transform to Mercator CRS
           lwgeom::st_snap_to_grid(15) %>% # snap to 15m grid
           lwgeom::st_make_valid() %>% # attempt to fix geometries
           sf::st_difference()

# print result
print(y)
## Simple feature collection with 230 features and 28 fields
## geometry type:  GEOMETRY
## dimension:      XY
## bbox:           xmin: 1525080 ymin: 4196340 xmax: 1663110 ymax: 4316685
## epsg (SRID):    3395
## proj4string:    +proj=merc +lon_0=0 +k=1 +x_0=0 +y_0=0 +datum=WGS84 +units=m +no_defs
## First 10 features:
##    WDPAID WDPA_PID PA_DEF                                            NAME
## 1  174757   174757      1      Il-Ġonna tal-Mall, Floriana (Siġar Antiki)
## 2  174758   174758      1                 Bidnija, Wardija (Żebbuġ Antik)
## 3  194415   194415      1           Il-Ġonna ta' San Anton (Siġar Antiki)
## 4  194416   194416      1                       Il-Buskett (Siġar Antiki)
## 5  194417   194417      1 Il-Wied ta' l-Imġiebaħ, Mellieħa (Ballut Antik)
## 6  194418   194418      1            Il-Ballut tal-Wardija (Ballut Antik)
## 7  194420   194420      1                                          Filfla
## 8  174737   174737      1                      Il-Baħar ta' Madwar Filfla
## 9  194423   194423      1                            Il-Ġebla tal-Ġeneral
## 10 194425   194425      1     Il-Gżejjer Selmunett (Gżejjer ta' San Pawl)
##                                          ORIG_NAME
## 1       Il-Ġonna tal-Mall, Floriana (Siġar Antiki)
## 2                  Bidnija, Wardija (Żebbuġ Antik)
## 3            Il-Ġonna ta' San Anton (Siġar Antiki)
## 4                        Il-Buskett (Siġar Antiki)
## 5  Il-Wied ta' l-Imġiebaħ, Mellieħa (Ballut Antik)
## 6             Il-Ballut tal-Wardija (Ballut Antik)
## 7                                           Filfla
## 8                       Il-Baħar ta' Madwar Filfla
## 9                             Il-Ġebla tal-Ġeneral
## 10     Il-Gżejjer Selmunett (Gżejjer ta' San Pawl)
##                                                        DESIG
## 1  List of Historical Trees Having an Antiquarian Importance
## 2  List of Historical Trees Having an Antiquarian Importance
## 3  List of Historical Trees Having an Antiquarian Importance
## 4  List of Historical Trees Having an Antiquarian Importance
## 5  List of Historical Trees Having an Antiquarian Importance
## 6  List of Historical Trees Having an Antiquarian Importance
## 7                                  Riserva Naturali (Filfla)
## 8                                              Żona Projbita
## 9                                 Riserva Naturali (Gżejjer)
## 10                                Riserva Naturali (Gżejjer)
##                                                    DESIG_ENG DESIG_TYPE
## 1  List of Historical Trees Having an Antiquarian Importance   National
## 2  List of Historical Trees Having an Antiquarian Importance   National
## 3  List of Historical Trees Having an Antiquarian Importance   National
## 4  List of Historical Trees Having an Antiquarian Importance   National
## 5  List of Historical Trees Having an Antiquarian Importance   National
## 6  List of Historical Trees Having an Antiquarian Importance   National
## 7                                    Nature Reserve (Filfla)   National
## 8        No Berthing Zone/No Entry Zone except for Fisheries   National
## 9                                   Nature Reserve (Islands)   National
## 10                                  Nature Reserve (Islands)   National
##    IUCN_CAT       INT_CRIT MARINE REP_M_AREA GIS_M_AREA   REP_AREA
## 1       III Not Applicable      0    0.00000    0.00000  0.0110000
## 2       III Not Applicable      0    0.00000    0.00000  0.0690000
## 3       III Not Applicable      0    0.00000    0.00000  0.0650000
## 4       III Not Applicable      0    0.00000    0.00000  0.0010000
## 5       III Not Applicable      0    0.00000    0.00000  0.0030000
## 6       III Not Applicable      0    0.00000    0.00000  0.0110000
## 7        Ia Not Applicable      0    0.00000    0.00000  0.0658768
## 8        VI Not Applicable      2   12.96806   12.99575 12.9680567
## 9        Ia Not Applicable      0    0.00000    0.00000  0.0060000
## 10       Ia Not Applicable      0    0.00000    0.00000  0.1069999
##        GIS_AREA        NO_TAKE NO_TK_AREA     STATUS STATUS_YR
## 1  1.111493e-02 Not Applicable          0 Designated      1933
## 2  6.907417e-02 Not Applicable          0 Designated      1933
## 3  6.521116e-02 Not Applicable          0 Designated      1933
## 4  6.869653e-04 Not Applicable          0 Designated      1933
## 5  3.203060e-03 Not Applicable          0 Designated      1933
## 6  1.061708e-02 Not Applicable          0 Designated      1933
## 7  6.601743e-02 Not Applicable          0 Designated      1988
## 8  1.299575e+01   Not Reported          0 Designated      1990
## 9  6.051646e-03 Not Applicable          0 Designated      1992
## 10 1.077914e-01 Not Applicable          0 Designated      1993
##                                  GOV_TYPE     OWN_TYPE
## 1  Federal or national ministry or agency Not Reported
## 2  Federal or national ministry or agency Not Reported
## 3  Federal or national ministry or agency Not Reported
## 4  Federal or national ministry or agency Not Reported
## 5  Federal or national ministry or agency Not Reported
## 6  Federal or national ministry or agency Not Reported
## 7  Federal or national ministry or agency Not Reported
## 8  Federal or national ministry or agency Not Reported
## 9  Federal or national ministry or agency Not Reported
## 10 Federal or national ministry or agency Not Reported
##                                                                                       MANG_AUTH
## 1  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 2  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 3  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 4  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 5  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 6  Heritage Malta, Office of the Prime Minister, Auberge de Castille, Valletta, VLT 2000, Malta
## 7     Environment and Resources Authority, Hexagon House, Spencer Hill, Marsa,  MRS 1441, Malta
## 8                              Transport Malta, Malta Transport Centre, Marsa, MRS 1917, Malta.
## 9     Environment and Resources Authority, Hexagon House, Spencer Hill, Marsa,  MRS 1441, Malta
## 10    Environment and Resources Authority, Hexagon House, Spencer Hill, Marsa,  MRS 1441, Malta
##       MANG_PLAN          VERIF METADATAID      SUB_LOC PARENT_ISO ISO3
## 1  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 2  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 3  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 4  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 5  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 6  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 7  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 8  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 9  Not Reported State Verified       1839 Not Reported        MLT  MLT
## 10 Not Reported State Verified       1839 Not Reported        MLT  MLT
##                          geometry
## 1  MULTIPOLYGON (((1614525 426...
## 2  MULTIPOLYGON (((1603635 426...
## 3  MULTIPOLYGON (((1608420 426...
## 4  MULTIPOLYGON (((1602690 425...
## 5  MULTIPOLYGON (((1600485 427...
## 6  GEOMETRYCOLLECTION (MULTIPO...
## 7  GEOMETRYCOLLECTION (MULTIPO...
## 8  POLYGON ((1606500 4246470, ...
## 9  GEOMETRYCOLLECTION (POLYGON...
## 10 GEOMETRYCOLLECTION (MULTIPO...

This approach is not ideal, given that we had to snap the data to a 15 meter grid and introduce a potentially non-negligible level of spatial error into the data set. I had tried smaller grid sizes but this did not resolve the error encountered in sf::st_difference(). Despite this limitation, however, the data is now "clean" and can actually be used for subsequent processing.

I would love to hear if anyone has any suggestions for improving this pull request or for cleaning this data set using the existing functionality provided by sf or lwgeom.

jeffreyhanson commented 6 years ago

@mdsumner had the great idea of using sf::st_simplify, and it would appear that it can resolve the problem when simplifying the data to 100 m.

y <- x %>% sf::st_set_precision(1e+10) %>%
           lwgeom::st_make_valid() %>% # attempt to fix geometries
           sf::st_transform(3395) %>% # transform to Mercator crs
           sf::st_simplify(dTolerance = 100) %>% # simplify to 100 m tolerance
           lwgeom::st_make_valid() %>%
           sf::st_difference()

So, I'm not sure if this pull request is still warranted. I'll leave this PR open until someone else has looked at this, but I'm happy to close it.

jeffreyhanson commented 6 years ago

Thank you very much for merging this and fixing the integration problems!

edzer commented 6 years ago

Thanks for contributing this, Jeffrey.

Yeah, it's a bit a tricky one: lwgeom_grid or lwgeom_grid_in_place are not "exported" by liblwgeom, so if you depend on liblwgeom to be a system library (e.g. coming with ubuntugis-unstable, or with PostGIS installation) it's "not there": header file missing, not in the compiled code (on my ubuntu 17.04). This may have to do with liblwgeom not having such a clear status as autonomous library, the way gdal, geos and proj.4 are for instance - it's essentially a PostGIS component we've pulled out. This might improve over time.