R-ArcGIS / arcgislayers

ArcGIS Location Services
http://r.esri.com/arcgislayers/
Apache License 2.0

Replace domains information #134

Open vidonne opened 8 months ago

vidonne commented 8 months ago

Is there a way to replace the coded values returned from a feature service with their domain labels?

For example, when I run the following, I get the code for the "iso3" column:

library(arcgislayers)

furl <- "https://gis.unhcr.org/arcgis/rest/services/core_v2/wrl_polbnd_int_15m_a_unhcr/FeatureServer/0"

pop_fl <- arc_open(furl)

arc_select(pop_fl, fields = c("iso3"), where = "iso3='SSD'")
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 2696545 ymin: 388463.4 xmax: 4001680 ymax: 1371522
#> Projected CRS: WGS 84 / Pseudo-Mercator
#>   iso3                       geometry
#> 1  SSD MULTIPOLYGON (((3797469 106...

It would be good to be able to get the associated domain labels instead, as you can with the {esri2sf} package:

library(esri2sf)

furl <- "https://gis.unhcr.org/arcgis/rest/services/core_v2/wrl_polbnd_int_15m_a_unhcr/FeatureServer/0"

esri2sf(furl, outFields = c("iso3"), where = "iso3='SSD'", replaceDomainInfo = TRUE)
#> Layer Type: Feature Layer
#> Geometry Type: esriGeometryPolygon
#> Service Coordinate Reference System: 3857
#> Output Coordinate Reference System: 4326
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 24.22348 ymin: 3.487471 xmax: 35.9477 ymax: 12.22673
#> Geodetic CRS:  WGS 84
#>          iso3                          geoms
#> 1 South Sudan MULTIPOLYGON (((34.11325 9....

Did I miss it somewhere in the documentation, or is there a way to do this through the web service itself? I know that "iso3" is not the best use case, but it's a simple example. We do have services that code location or population types, and for those it would be good to get the label rather than the code out of the arcgislayers call.

Thanks for the great package and your support on this. Cedric

JosiahParry commented 8 months ago

Thanks, Cedric! There's no support for it yet, but this is really great feedback. I'd prefer to handle this outside of arc_select(), to keep the scope of each function as minimal as possible.

It would probably be something like domain_substitute(feature_layer, .data, vars = c("a", "b", "c")).

Here's how you can accomplish it today, though!

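library(dplyr)
library(arcgislayers)

furl <- "https://gis.unhcr.org/arcgis/rest/services/core_v2/wrl_polbnd_int_15m_a_unhcr/FeatureServer/0"

pop_fl <- arc_open(furl)

res <- arc_select(pop_fl, fields = c("iso3"), n_max = 10)

# start replacing domains
# every column except the geometry column
var_names <- setdiff(names(res), attr(res, "sf_column"))

# named list of domains, keyed by field name
# is.null() is not vectorised, so each entry of the domain column is tested
# individually (this assumes `domain` is a list column)
domains <- list_fields(pop_fl) |>
  filter(name %in% var_names, vapply(domain, is.list, logical(1))) |>
  select(name, domain) |>
  tibble::deframe()

for (field_name in names(domains)) {
  coded <- domains[[field_name]]$codedValues
  # skip domains without coded values (e.g. range domains)
  if (is.null(coded)) next

  # create a lookup table: names are codes, values are labels
  lu <- tibble::deframe(coded[, 2:1])

  # modify in place, indexing by code as a character key
  res[[field_name]] <- unname(lu[as.character(res[[field_name]])])
}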

res
#> Simple feature collection with 10 features and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: -6278153 ymin: -7018201 xmax: 18710210 ymax: 5957233
#> Projected CRS: WGS 84 / Pseudo-Mercator
#>                                       iso3                       geometry
#> 1             No code (ISO user specified) MULTIPOLYGON (((8835003 390...
#> 2             No code (ISO user specified) MULTIPOLYGON (((10836631 32...
#> 3                     Norfolk Island (AUS) MULTIPOLYGON (((18706427 -3...
#> 4             No code (ISO user specified) MULTIPOLYGON (((3693874 251...
#> 5             No code (ISO user specified) MULTIPOLYGON (((3970901 262...
#> 6             No code (ISO user specified) MULTIPOLYGON (((8788443 394...
#> 7                   Christmas Island (AUS) MULTIPOLYGON (((11768307 -1...
#> 8             No code (ISO user specified) MULTIPOLYGON (((3207325 106...
#> 9           Saint Pierre et Miquelon (FRA) MULTIPOLYGON (((-6271573 59...
#> 10 Heard Island and McDonald Islands (AUS) MULTIPOLYGON (((8195908 -70...

Created on 2024-01-31 with reprex v2.0.2
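
The loop above could be wrapped into a helper along the lines of the proposed domain_substitute(). The following is only a sketch, not part of arcgislayers: the function body, the argument order, and the mock `fields` table are assumptions. `fields` is assumed to have the same shape as the list_fields() result used above (a `name` column and a `domain` list column whose entries hold a `codedValues` data frame with `name` and `code` columns), and the sample codes and labels are made up for illustration so the example runs without a network connection.

```r
# Hypothetical sketch of domain_substitute(); not part of arcgislayers.
domain_substitute <- function(fields, .data, vars = names(.data)) {
  for (v in intersect(vars, fields$name)) {
    coded <- fields$domain[[match(v, fields$name)]]$codedValues
    # fields without a coded-value domain are left untouched
    if (is.null(coded)) next
    # look up each code's position in the domain table, then take its label
    idx <- match(as.character(.data[[v]]), as.character(coded$code))
    .data[[v]] <- coded$name[idx]
  }
  .data
}

# Mock field metadata (illustrative values only)
fields <- data.frame(name = c("iso3", "pop"), stringsAsFactors = FALSE)
fields$domain <- list(
  list(codedValues = data.frame(
    name = c("South Sudan", "Norfolk Island (AUS)"),
    code = c("SSD", "NFK"),
    stringsAsFactors = FALSE
  )),
  NULL
)

dat <- data.frame(iso3 = c("SSD", "NFK"), pop = c(10, 20), stringsAsFactors = FALSE)
out <- domain_substitute(fields, dat, vars = c("iso3", "pop"))
```

Codes with no match in the domain table become NA here. A real implementation might prefer to keep the original code or warn instead.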

JosiahParry commented 8 months ago

There's probably a use case for converting values from code to label and back. One example would be getting data from an external source that needs to be appended or updated in a feature service, where the data you get uses the labels and not the codes.

Perhaps domain_encode() and domain_decode() could be provided?
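
As a rough illustration of what such a pair might look like, here is a minimal sketch working on plain vectors. Neither function exists in arcgislayers; the signatures are assumptions, `coded_values` stands in for a domain's codedValues table, and the sample rows are made up.

```r
# Hypothetical helpers; neither function exists in arcgislayers.

# codes -> labels; codes missing from the domain become NA
domain_decode <- function(x, coded_values) {
  coded_values$name[match(as.character(x), as.character(coded_values$code))]
}

# labels -> codes; warns about labels that are not in the domain
domain_encode <- function(x, coded_values) {
  idx <- match(x, coded_values$name)
  unmatched <- unique(x[is.na(idx) & !is.na(x)])
  if (length(unmatched) > 0) {
    warning("No code found for: ", paste(unmatched, collapse = ", "))
  }
  coded_values$code[idx]
}

# Illustrative coded values only
coded_values <- data.frame(
  name = c("South Sudan", "Norfolk Island (AUS)"),
  code = c("SSD", "NFK"),
  stringsAsFactors = FALSE
)

labels <- domain_decode(c("SSD", "NFK", "SSD"), coded_values)
codes <- domain_encode(labels, coded_values)
```

The warning in domain_encode() reflects the append/update use case: external data may contain labels the service's domain does not know about, and silently writing NA back to a feature service is probably not what you want.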

vidonne commented 8 months ago

Thanks for the feedback and current solution, really helpful. I don't really have a strong position on the implementation but it would be a nice thing to have. Thanks again for the great work on this package.

dickoa commented 8 months ago

Another option is to encode domains using the {labelled} package. It adds a new dependency but provides a mechanism to switch between labels and values. It's also supported by many packages, since it builds on the labelled vector class that {haven} uses for data from Stata, SPSS, SAS, etc.

With labelled columns, you'll have this type of output:

arc_select(pop_fl, fields = "iso3", where = "iso3='SSD'")
#> Simple feature collection with 1 feature and 1 field
#> Geometry type: MULTIPOLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 2696545 ymin: 388463.4 xmax: 4001680 ymax: 1371522
#> Projected CRS: WGS 84 / Pseudo-Mercator
#>   iso3                       geometry
#>  <chr+lbl>              <MULTIPOLYGON [m]> 
#> 1  SSD [South Sudan] MULTIPOLYGON (((3797469 106...

And you can use functions like labelled::to_character/labelled::to_factor (applied to specific columns or to the whole data frame) to get the labels on demand.
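
For a sense of how that would work on a single column, here is a small sketch using {labelled} directly. It assumes the package is installed; the `coded_values` table is made-up sample data standing in for a domain's coded values, and nothing here is done by arcgislayers itself.

```r
library(labelled)

# Illustrative coded values only
coded_values <- data.frame(
  name = c("South Sudan", "Norfolk Island (AUS)"),
  code = c("SSD", "NFK"),
  stringsAsFactors = FALSE
)

# labelled() expects a named vector: names are labels, values are codes
value_labels <- setNames(coded_values$code, coded_values$name)
iso3 <- labelled(c("SSD", "NFK", "SSD"), labels = value_labels)

# the codes stay in the data; labels are retrieved on demand
iso3_labels <- to_character(iso3)
iso3_factor <- to_factor(iso3)
```

The appeal of this approach is that the column keeps its codes, so writing the data back to a service needs no re-encoding step.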

elipousson commented 7 months ago

FWIW - esri2sf has some code for handling domain information where available. I believe @jacpete contributed this code to the original version of esri2sf but I've never touched it myself: https://github.com/elipousson/esri2sf/blob/master/R/addDomainInfo.R