R-ArcGIS / arcgisgeocode

Utilize public or private ArcGIS Geocoder Services from R. Provides reverse geocoding, candidate search, single address, and batch geocoding.
http://r.esri.com/arcgisgeocode/
Apache License 2.0
40 stars 6 forks source link

Geocoders with custom fields won't work #20

Closed aaronkrusniak closed 5 months ago

aaronkrusniak commented 5 months ago

Hi there! Started testing this package today and found a bug (or rather it seems like the code may just not be implemented yet?)

I'm trying to geocode a few addresses using an enterprise geocoder with custom fields:

library(arcgis)
library(arcgisbinding)
library(arcgisgeocode)
library(tidyverse)

arc.check_portal()
set_arc_token(auth_binding())

music_venues <- tribble(
  ~Name,              ~Address,
  "Aragon Ballroom",  "1106 W. Lawrence Ave.",
  "House of Blues",   "329 N. Dearborn St.",
  "Bottom Lounge",    "1375 W. Lake St.",
  "The Vic",          "3145 N. Sheffield Ave.",
  "Park West",        "322 W. Armitage Ave.",
  "Thalia Hall",      "1807 S. Allport St.",
  "Lincoln Hall",     "2424 N. Lincoln Ave.",
  "Schubas Tavern",   "3159 N. Southport Ave."
)

chigeocode <- geocode_server("https://maps.chicago.gov/arcgis/rest/services/Chicago_Addresses/GeocodeServer")

venues_geocoded <- geocode_addresses(single_line = music_venues$Address,
                                     for_storage = TRUE,
                                     geocoder = chigeocode)

And this last call to geocode_addresses is what generates the error:

#> Error in parse_locations_res(string, use_custom_json_processing) : 
#>   argument "geocoder" is missing, with no default

I took a peek at your source code to see if I could track down a lead, and I think I found one. In the source for core-batch-geocode.R, you implement some handling for custom fields (lines 283-295):

 # Before we can process the responses, we must know if
  # the locator has custom fields. If so, we need to use
  # RcppSimdJson and _not_ the Rust based implementation
  use_custom_json_processing <- has_custom_fields(geocoder)

  # TODO Handle errors
  all_results <- lapply(all_resps, function(.resp) {
    string <- httr2::resp_body_string(.resp)
    parse_locations_res(
      string,
      use_custom_json_processing
    )
  })

And the body of parse_locations_res() looks like this (lines 328-354):

parse_locations_res <- function(
    string,
    has_custom_fields,
    n,
    geocoder,
    call = rlang::caller_env()) {
  check_bool(has_custom_fields, allow_na = FALSE, allow_null = FALSE, call = call)

  if (has_custom_fields) {
    res_list <- parse_custom_loc_json(string, geocoder, n, call)
  } else {
    res_list <- parse_location_json(string)
  }

  if (is.null(res_list)) {
    return(NULL)
  }

  res <- res_list[["attributes"]]

  geometry <- sf::st_sfc(
    res_list[["locations"]],
    crs = parse_wkid(res_list$sr$wkid)
  )
  # craft the {sf} object
  sf::st_sf(res, geometry)
}

So, it looks to me like the call to parse_locations_res() on line 291 isn't providing a complete list of arguments. That's fine when use_custom_json_processing is FALSE, but when it's TRUE (i.e. when the geocoder has custom fields), parse_locations_res() will try to call parse_custom_loc_json() using geocoder and n arguments that it hasn't been provided.

Looks like this may also be somewhat related to #8.

Sorry if this ends up being bad information— not sure of all the technical details in your stack— but hopefully I am at least pointing you in the right direction or giving you a lead to work off of! Thanks so much for making this package, I think this'll be a huge boon to my organization once we can get it working with our enterprise geocoder :)

JosiahParry commented 5 months ago

Thank you! I have a very limited number of custom geocoders to work with personally and i figured an issue might arise with respect to custom geocoding fields. The custom json parsing is implemented but I suspect I may have written a bug like you point out somewhere.

It appears that your geocoding service is not available publicly. If possible, would you be able to share what your field metadata looks like?

This can be done via

chigeocoder$candidateFields |> dput()
aaronkrusniak commented 5 months ago

Sure! It's not pretty, but it looks like this. I'll note that in our case, our custom fields are named all the same as the candidate fields, but they are for a candidate street intersection.

structure(list(name = c("Loc_name", "Shape", "Status", "Score", 
"Match_addr", "LongLabel", "ShortLabel", "Addr_type", "Type", 
"PlaceName", "Place_addr", "Phone", "URL", "Rank", "AddBldg", 
"AddNum", "AddNumFrom", "AddNumTo", "AddRange", "Side", "StPreDir", 
"StPreType", "StName", "StType", "StDir", "BldgType", "BldgName", 
"LevelType", "LevelName", "UnitType", "UnitName", "SubAddr", 
"StAddr", "Block", "Sector", "Nbrhd", "District", "City", "MetroArea", 
"Subregion", "Region", "RegionAbbr", "Territory", "Zone", "Postal", 
"PostalExt", "Country", "CntryName", "LangCode", "Distance", 
"X", "Y", "DisplayX", "DisplayY", "Xmin", "Xmax", "Ymin", "Ymax", 
"ExInfo", "User_fld"), type = c("esriFieldTypeString", "esriFieldTypeGeometry", 
"esriFieldTypeString", "esriFieldTypeDouble", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeDouble", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeString", 
"esriFieldTypeString", "esriFieldTypeString", "esriFieldTypeDouble", 
"esriFieldTypeDouble", "esriFieldTypeDouble", "esriFieldTypeDouble", 
"esriFieldTypeDouble", "esriFieldTypeDouble", "esriFieldTypeDouble", 
"esriFieldTypeDouble", "esriFieldTypeDouble", "esriFieldTypeString", 
"esriFieldTypeString"), alias = c("Loc_name", "Shape", "Status", 
"Score", "Match_addr", "LongLabel", "ShortLabel", "Addr_type", 
"Type", "PlaceName", "Place_addr", "Phone", "URL", "Rank", "BuildingName", 
"AddressNumber", "AddNumFrom", "AddNumTo", "AddressRange", "Side", 
"StPreDir", "StPreType", "StName", "StType", "StDir", "BldgType", 
"BldgName", "LevelType", "LevelName", "UnitType", "UnitName", 
"SubAddress", "StAddr", "Block", "Sector", "Neighborhood", "District", 
"City", "MetroArea", "Subregion", "Region", "RegionAbbr", "Territory", 
"Zone", "Postal", "PostalExt", "Country", "CountryName", "LangCode", 
"Distance", "X", "Y", "DisplayX", "DisplayY", "Xmin", "Xmax", 
"Ymin", "Ymax", "ExtraInfo", "User_fld"), required = c(FALSE, 
TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, FALSE, 
FALSE, FALSE, FALSE, FALSE), length = c(14L, NA, 1L, NA, 500L, 
500L, 500L, 20L, 50L, 200L, 500L, 25L, 250L, NA, 125L, 50L, 50L, 
50L, 100L, 1L, 5L, 50L, 125L, 30L, 20L, 20L, 50L, 20L, 50L, 20L, 
50L, 250L, 300L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 120L, 
50L, 120L, 100L, 20L, 10L, 30L, 100L, 5L, NA, NA, NA, NA, NA, 
NA, NA, NA, NA, 500L, 255L)), row.names = c(NA, 60L), class = "data.frame")
JosiahParry commented 5 months ago

@aaronkrusniak thank you! I was able to create a reprex:

library(arcgisutils)
library(arcgisgeocode)

set_arc_token(auth_user())
geocoder <- geocode_server("https://datahub.esriuk.com/arcgis/rest/services/gb_locators/bng_locator/GeocodeServer")
arcgisgeocode:::has_custom_fields(geocoder)

geocode_addresses(
  "Old Bedford Rd, Luton LU2 7HA, United Kingdom",
  geocoder = geocoder
)

I'll figure this out shortly and push a patch to GitHub. I'll let you know when it's available

aaronkrusniak commented 5 months ago

Thanks so much @JosiahParry !

JosiahParry commented 5 months ago

@aaronkrusniak Assuming CI is all green I'll merge this in and it should be fixed! https://github.com/R-ArcGIS/arcgisgeocode/pull/21

The binaries should be built about an hour-ish after merging from r-universe. Or, you can install directly from github if you have rust available on your machine.

JosiahParry commented 5 months ago

Merged! You should be able to install it from https://r-arcgis.r-universe.dev/arcgisgeocode in about an hour. Thank you so much for reporting this issue.

Lmk if I can help at all or if you ever want to hop on a call and work through a problem that our packages(may or may not) help with.

aaronkrusniak commented 5 months ago

@JosiahParry thank you so much! I'll give it a spin and let you know if I run into any more strange behavior :)

ryan-honeck commented 5 months ago

I'm having the same issue with a custom geocoder where I receive the error "Error in parse_locations_res(string, use_custom_json_processing) : argument "geocoder" is missing, with no default".

I'm using arcgisgeocode version 1.01.306.

JosiahParry commented 5 months ago

@ryan-honeck that is an arcgisbinding version. Can you try installing the latest build from R-universe? https://r-arcgis.r-universe.dev/arcgisgeocode