r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/

How to re-establish sfc-class after it has been removed? #586

Closed tiernanmartin closed 6 years ago

tiernanmartin commented 6 years ago

I discovered that nested sf objects don't survive the round-trip conversion when the sfc is included in the nesting:

st_read(system.file("shape/nc.shp", package = "sf")) %>% 
   mutate(ROW = row_number()) %>% 
   nest(-ROW) 

## # A tibble: 100 x 2
##      ROW          data
##    <int>        <list>
##  1     1 <sf [1 x 15]>
##  2     2 <sf [1 x 15]>
##  3     3 <sf [1 x 15]>
##  4     4 <sf [1 x 15]>
##  5     5 <sf [1 x 15]>
##  6     6 <sf [1 x 15]>
##  7     7 <sf [1 x 15]>
##  8     8 <sf [1 x 15]>
##  9     9 <sf [1 x 15]>
## 10    10 <sf [1 x 15]>
## # ... with 90 more rows
st_read(system.file("shape/nc.shp", package = "sf")) %>% 
   mutate(ROW = row_number()) %>% 
   nest(-ROW) %>%
   unnest()

## Warning in bind_rows_(x, .id): Vectorizing 'sfc_MULTIPOLYGON' elements may
## not preserve their attributes
## There were 50 or more warnings (use warnings() to see the first 50)

## # A tibble: 100 x 16
##      ROW  AREA PERIMETER CNTY_ CNTY_ID        NAME   FIPS FIPSNO CRESS_ID
##    <int> <dbl>     <dbl> <dbl>   <dbl>      <fctr> <fctr>  <dbl>    <int>
##  1     1 0.114     1.442  1825    1825        Ashe  37009  37009        5
##  2     2 0.061     1.231  1827    1827   Alleghany  37005  37005        3
##  3     3 0.143     1.630  1828    1828       Surry  37171  37171       86
##  4     4 0.070     2.968  1831    1831   Currituck  37053  37053       27
##  5     5 0.153     2.206  1832    1832 Northampton  37131  37131       66
##  6     6 0.097     1.670  1833    1833    Hertford  37091  37091       46
##  7     7 0.062     1.547  1834    1834      Camden  37029  37029       15
##  8     8 0.091     1.284  1835    1835       Gates  37073  37073       37
##  9     9 0.118     1.421  1836    1836      Warren  37185  37185       93
## 10    10 0.124     1.428  1837    1837      Stokes  37169  37169       85
## # ... with 90 more rows, and 7 more variables: BIR74 <dbl>, SID74 <dbl>,
## #   NWBIR74 <dbl>, BIR79 <dbl>, SID79 <dbl>, NWBIR79 <dbl>,
## #   geometry <list>

That's unfortunate, but it seems to be the result of a limitation in unnest(): it calls bind_rows_(), which doesn't preserve the attributes of the sfc column (see the reprex below).

Question: Is there a way to re-establish class = sfc for a column that has been stripped of its attributes?

Naively I thought unnest(nc_nested) %>% mutate(geometry = map(geometry, st_multipolygon)) would do it, but it doesn't seem to work.
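A minimal sketch of the underlying idea, using two toy points instead of the nc polygons (the names `geoms`, `stripped`, and `restored` are illustrative only). It assumes the stripped column is a plain list of sfg objects, which is what the warning above suggests. A likely reason the map() attempt fails is that map() always returns a plain list, so st_sf() still finds no sfc column; st_sfc() is the constructor that turns such a list back into an sfc.

```r
library(sf)

# Two toy point geometries stand in for the county polygons.
geoms <- st_sfc(st_point(c(0, 0)), st_point(c(1, 1)), crs = 4326)

# Simulate a column whose attributes were dropped: a bare list of
# sfg objects, without the sfc class, bbox, or CRS.
stripped <- lapply(seq_along(geoms), function(i) geoms[[i]])
inherits(stripped, "sfc")

# st_sfc() accepts a list of sfg objects and rebuilds the sfc.
# The CRS is passed again because it was lost with the attributes.
restored <- st_sfc(stripped, crs = st_crs(geoms))
inherits(restored, "sfc")
```

Whether the CRS needs to be supplied by hand depends on where the geometries come from; here it is copied from the original object.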

Reprex + Session info

``` r
library(tidyverse)
library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.0, proj.4 4.9.3

nc <- st_read(system.file("shape/nc.shp", package = "sf"))

nc_nested <- nc %>% mutate(ROW = row_number()) %>% nest(-ROW)

print(nc_nested)
## # A tibble: 100 x 2
##      ROW          data
##    <int>        <list>
##  1     1 <sf [1 x 15]>
##  2     2 <sf [1 x 15]>
##  3     3 <sf [1 x 15]>
##  4     4 <sf [1 x 15]>
##  5     5 <sf [1 x 15]>
##  6     6 <sf [1 x 15]>
##  7     7 <sf [1 x 15]>
##  8     8 <sf [1 x 15]>
##  9     9 <sf [1 x 15]>
## 10    10 <sf [1 x 15]>
## # ... with 90 more rows

# Attempts to unnest

unnest(nc_nested)
## Warning in bind_rows_(x, .id): Vectorizing 'sfc_MULTIPOLYGON' elements may
## not preserve their attributes
## There were 50 or more warnings (use warnings() to see the first 50)
## # A tibble: 100 x 16
##      ROW  AREA PERIMETER CNTY_ CNTY_ID        NAME   FIPS FIPSNO CRESS_ID
##    <int> <dbl>     <dbl> <dbl>   <dbl>      <fctr> <fctr>  <dbl>    <int>
##  1     1 0.114     1.442  1825    1825        Ashe  37009  37009        5
##  2     2 0.061     1.231  1827    1827   Alleghany  37005  37005        3
##  3     3 0.143     1.630  1828    1828       Surry  37171  37171       86
##  4     4 0.070     2.968  1831    1831   Currituck  37053  37053       27
##  5     5 0.153     2.206  1832    1832 Northampton  37131  37131       66
##  6     6 0.097     1.670  1833    1833    Hertford  37091  37091       46
##  7     7 0.062     1.547  1834    1834      Camden  37029  37029       15
##  8     8 0.091     1.284  1835    1835       Gates  37073  37073       37
##  9     9 0.118     1.421  1836    1836      Warren  37185  37185       93
## 10    10 0.124     1.428  1837    1837      Stokes  37169  37169       85
## # ... with 90 more rows, and 7 more variables: BIR74 <dbl>, SID74 <dbl>,
## #   NWBIR74 <dbl>, BIR79 <dbl>, SID79 <dbl>, NWBIR79 <dbl>,
## #   geometry <list>

unnest(nc_nested) %>% mutate(geometry = map(geometry, st_multipolygon)) %>% st_sf()
## Warning in bind_rows_(x, .id): Vectorizing 'sfc_MULTIPOLYGON' elements may
## not preserve their attributes
## There were 50 or more warnings (use warnings() to see the first 50)
## Error in st_sf(.): no simple features geometry column present
```

``` r
devtools::session_info()
## Session info -------------------------------------------------------------
##  setting  value
##  version  R version 3.4.0 (2017-04-21)
##  system   x86_64, mingw32
##  ui       RTerm
##  language (EN)
##  collate  English_United States.1252
##  tz       America/Los_Angeles
##  date     2017-12-05
## Packages -----------------------------------------------------------------
##  package    * version    date       source
##  assertthat   0.2.0      2017-04-11 CRAN (R 3.4.2)
##  backports    1.1.0      2017-05-22 CRAN (R 3.4.0)
##  base       * 3.4.0      2017-04-21 local
##  bindr        0.1        2016-11-13 CRAN (R 3.4.2)
##  bindrcpp   * 0.2        2017-06-17 CRAN (R 3.4.2)
##  broom        0.4.2      2017-02-13 CRAN (R 3.4.0)
##  cellranger   1.1.0      2016-07-27 CRAN (R 3.4.2)
##  class        7.3-14     2015-08-30 CRAN (R 3.4.0)
##  classInt     0.1-24     2017-04-16 CRAN (R 3.4.2)
##  cli          1.0.0      2017-11-05 CRAN (R 3.4.2)
##  colorspace   1.3-2      2016-12-14 CRAN (R 3.4.2)
##  compiler     3.4.0      2017-04-21 local
##  crayon       1.3.4      2017-10-30 Github (r-lib/crayon@b5221ab)
##  datasets   * 3.4.0      2017-04-21 local
##  DBI          0.7        2017-06-18 CRAN (R 3.4.2)
##  devtools     1.13.2     2017-06-02 CRAN (R 3.4.0)
##  digest       0.6.12     2017-01-27 CRAN (R 3.4.0)
##  dplyr      * 0.7.4      2017-09-28 CRAN (R 3.4.2)
##  e1071        1.6-8      2017-02-02 CRAN (R 3.4.2)
##  evaluate     0.10       2016-10-11 CRAN (R 3.4.0)
##  forcats    * 0.2.0      2017-01-23 CRAN (R 3.4.2)
##  foreign      0.8-67     2016-09-13 CRAN (R 3.4.0)
##  ggplot2    * 2.2.1.9000 2017-12-02 Github (tidyverse/ggplot2@7b5c185)
##  glue         1.2.0.9000 2017-12-05 Github (tidyverse/glue@69bc72c)
##  graphics   * 3.4.0      2017-04-21 local
##  grDevices  * 3.4.0      2017-04-21 local
##  grid         3.4.0      2017-04-21 local
##  gtable       0.2.0      2016-02-26 CRAN (R 3.4.2)
##  haven        1.1.0      2017-07-09 CRAN (R 3.4.2)
##  hms          0.3        2016-11-22 CRAN (R 3.4.2)
##  htmltools    0.3.6      2017-04-28 CRAN (R 3.4.0)
##  httr         1.3.1      2017-08-20 CRAN (R 3.4.2)
##  jsonlite     1.5        2017-06-01 CRAN (R 3.4.0)
##  knitr        1.16       2017-05-18 CRAN (R 3.4.0)
##  lattice      0.20-35    2017-03-25 CRAN (R 3.4.0)
##  lazyeval     0.2.1      2017-10-29 CRAN (R 3.4.2)
##  lubridate    1.7.1      2017-11-03 CRAN (R 3.4.2)
##  magrittr     1.5        2014-11-22 CRAN (R 3.4.0)
##  memoise      1.1.0      2017-04-21 CRAN (R 3.4.0)
##  methods    * 3.4.0      2017-04-21 local
##  mnormt       1.5-5      2016-10-15 CRAN (R 3.4.1)
##  modelr       0.1.1      2017-07-24 CRAN (R 3.4.2)
##  munsell      0.4.3      2016-02-13 CRAN (R 3.4.2)
##  nlme         3.1-131    2017-02-06 CRAN (R 3.4.0)
##  parallel     3.4.0      2017-04-21 local
##  pkgconfig    2.0.1      2017-03-21 CRAN (R 3.4.2)
##  plyr         1.8.4      2016-06-08 CRAN (R 3.4.2)
##  psych        1.7.8      2017-09-09 CRAN (R 3.4.2)
##  purrr      * 0.2.4.9000 2017-12-05 Github (tidyverse/purrr@62b135a)
##  R6           2.2.2      2017-06-17 CRAN (R 3.4.0)
##  Rcpp         0.12.14    2017-11-23 CRAN (R 3.4.2)
##  readr      * 1.1.1      2017-05-16 CRAN (R 3.4.2)
##  readxl       1.0.0      2017-04-18 CRAN (R 3.4.2)
##  reshape2     1.4.2      2016-10-22 CRAN (R 3.4.2)
##  rlang        0.1.4      2017-11-05 CRAN (R 3.4.2)
##  rmarkdown    1.8        2017-11-17 CRAN (R 3.4.2)
##  rprojroot    1.2        2017-01-16 CRAN (R 3.4.0)
##  rvest        0.3.2      2016-06-17 CRAN (R 3.4.2)
##  scales       0.5.0.9000 2017-12-02 Github (hadley/scales@d767915)
##  sf         * 0.5-6      2017-12-04 Github (r-spatial/sf@315890c)
##  stats      * 3.4.0      2017-04-21 local
##  stringi      1.1.6      2017-11-17 CRAN (R 3.4.2)
##  stringr    * 1.2.0      2017-02-18 CRAN (R 3.4.0)
##  tibble     * 1.3.4      2017-08-22 CRAN (R 3.4.2)
##  tidyr      * 0.7.2.9000 2017-12-05 Github (tidyverse/tidyr@efd9ea5)
##  tidyselect   0.2.3      2017-11-06 CRAN (R 3.4.3)
##  tidyverse  * 1.2.1      2017-11-14 CRAN (R 3.4.2)
##  tools        3.4.0      2017-04-21 local
##  udunits2     0.13       2016-11-17 CRAN (R 3.4.1)
##  units        0.4-6      2017-08-27 CRAN (R 3.4.2)
##  utils      * 3.4.0      2017-04-21 local
##  withr        2.1.0.9000 2017-12-02 Github (jimhester/withr@fe81c00)
##  xml2         1.1.1      2017-01-24 CRAN (R 3.4.2)
##  yaml         2.1.14     2016-11-12 CRAN (R 3.4.0)
```

tiernanmartin commented 6 years ago

I'm still unclear about how to re-establish the sfc class after the attributes are dropped, but I have figured out a way to use cbind to perform the nesting round-trip (not pretty):

library(tidyverse) 
library(sf)
## Linking to GEOS 3.6.1, GDAL 2.2.0, proj.4 4.9.3

nc_nested <- 
  st_read(system.file("shape/nc.shp", package = "sf"))  %>%  
  mutate(ROW = row_number()) %>% 
  {cbind(
    data = st_set_geometry(.,NULL) %>% as_tibble %>% nest(-ROW) %>% select(data),
    geom = st_geometry(.))
  } %>% 
  as_tibble() %>% 
  st_as_sf()

nc_nested
## Simple feature collection with 100 features and 1 field
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## epsg (SRID):    4267
## proj4string:    +proj=longlat +datum=NAD27 +no_defs
## # A tibble: 100 x 2
##                 data          geometry
##               <list>  <simple_feature>
##  1 <tibble [1 x 14]> <MULTIPOLYGON...>
##  2 <tibble [1 x 14]> <MULTIPOLYGON...>
##  3 <tibble [1 x 14]> <MULTIPOLYGON...>
##  4 <tibble [1 x 14]> <MULTIPOLYGON...>
##  5 <tibble [1 x 14]> <MULTIPOLYGON...>
##  6 <tibble [1 x 14]> <MULTIPOLYGON...>
##  7 <tibble [1 x 14]> <MULTIPOLYGON...>
##  8 <tibble [1 x 14]> <MULTIPOLYGON...>
##  9 <tibble [1 x 14]> <MULTIPOLYGON...>
## 10 <tibble [1 x 14]> <MULTIPOLYGON...>
## # ... with 90 more rows

#Unnest

unnest(nc_nested)
## Simple feature collection with 100 features and 14 fields
## geometry type:  MULTIPOLYGON
## dimension:      XY
## bbox:           xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965
## epsg (SRID):    4267
## proj4string:    +proj=longlat +datum=NAD27 +no_defs
## # A tibble: 100 x 15
##     AREA PERIMETER CNTY_ CNTY_ID        NAME   FIPS FIPSNO CRESS_ID BIR74
##    <dbl>     <dbl> <dbl>   <dbl>      <fctr> <fctr>  <dbl>    <int> <dbl>
##  1 0.114     1.442  1825    1825        Ashe  37009  37009        5  1091
##  2 0.061     1.231  1827    1827   Alleghany  37005  37005        3   487
##  3 0.143     1.630  1828    1828       Surry  37171  37171       86  3188
##  4 0.070     2.968  1831    1831   Currituck  37053  37053       27   508
##  5 0.153     2.206  1832    1832 Northampton  37131  37131       66  1421
##  6 0.097     1.670  1833    1833    Hertford  37091  37091       46  1452
##  7 0.062     1.547  1834    1834      Camden  37029  37029       15   286
##  8 0.091     1.284  1835    1835       Gates  37073  37073       37   420
##  9 0.118     1.421  1836    1836      Warren  37185  37185       93   968
## 10 0.124     1.428  1837    1837      Stokes  37169  37169       85  1612
## # ... with 90 more rows, and 6 more variables: SID74 <dbl>, NWBIR74 <dbl>,
## #   BIR79 <dbl>, SID79 <dbl>, NWBIR79 <dbl>, geometry <simple_feature>
edzer commented 6 years ago

Does

x <- read_sf(system.file("shape/nc.shp", package = "sf")) %>% 
   mutate(ROW = row_number()) %>% 
   nest(-ROW)
unnest(x) %>% mutate(geometry = st_sfc(geometry)) %>% st_sf

help?

tiernanmartin commented 6 years ago

That's helpful and pretty close to the steps I'd expect.

I want to be able to access the sfc within the nested tibble object - just like I would any other column. Your suggestion pointed me in the right direction - this works the way I expected it to:

st_read(system.file("shape/nc.shp", package = "sf")) %>% 
  mutate(ROW = row_number()) %>% 
  nest(-ROW) %>% 
  mutate(NAME = map_chr(data, "NAME"),
         geometry = map(data, "geometry") %>% purrr::flatten %>% st_sfc) %>% 
  st_sf 

flatten() removes a single layer of hierarchy from data$geometry and st_sfc() re-establishes the class.
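A small sketch of that last step in isolation, on toy points rather than the nc data (`per_row`, `flat`, `no_crs`, and `with_crs` are hypothetical names; base `unlist(recursive = FALSE)` stands in for purrr::flatten here). One thing worth checking in a real pipeline: as far as I can tell, st_sfc() called on a plain list of sfg objects does not carry a CRS unless one is supplied, so it may need to be passed explicitly.

```r
library(sf)

# A list of one-element sfc objects, similar in shape to what
# map(data, "geometry") returns for the nested tibble.
per_row <- list(
  st_sfc(st_point(c(0, 0)), crs = 4326),
  st_sfc(st_point(c(1, 1)), crs = 4326)
)

# Remove one level of nesting, leaving a plain list of sfg objects.
flat <- unlist(per_row, recursive = FALSE)

# Rebuild the geometry column, once without and once with a CRS.
no_crs   <- st_sfc(flat)
with_crs <- st_sfc(flat, crs = st_crs(per_row[[1]]))

is.na(st_crs(no_crs))
st_crs(with_crs)
```

In the pipeline above, the equivalent would be to pass `crs = st_crs(...)` of the original layer to st_sfc(), assuming all nested rows share one CRS.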