The Issue
Using gnr_resolve(), I never obtain the same matched name for more than one user-supplied name - even when doing so would clearly give a better match. These erroneous matches persist even when gnr_resolve() is queried with a single species.
Minimal Working Example
Evidently, the best match for Lagopus matu (first row in the output) should be Lagopus muta, as was matched correctly in row four. Additionally, the matches for Lagopus lagopus (row 3) and Lagopas lagopus (row 5) ought to be the same: Lagopus lagopus.
Interestingly, running gnr_resolve() on just the first species:
gnr_resolve(sci = sps[1], best_match_only = TRUE)
still results in the same erroneous match as above:
# A tibble: 1 × 5
  user_supplied_name submitted_name matched_name          data_source_title      score
* <chr>              <chr>          <chr>                 <chr>                  <dbl>
1 Lagopus matu       Lagopus matu   Lagopus Brisson, 1760 Catalogue of Life Che…  0.75
Workaround
For now, I have put together a workaround with the rgbif package:
library(rgbif)
Fixed_Species <- sapply(sps, # loop over species names
  FUN = function(x) {
    gbif_resolve <- rgbif::name_backbone_verbose(x) # retrieve GBIF backbone matches
    ifelse(gbif_resolve$data$matchType != "NONE",
      gbif_resolve$data$canonicalName[1], # if a match has been made, pull the matched canonical name
      gbif_resolve$alternatives$canonicalName # if no match, pull the alternative matches from fuzzy matching
    )
  }
)
The workaround above, to me, leads to the expected matches.
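One detail of the workaround is worth spelling out: ifelse() returns a value of the same length as its test, so with a single matchType only the first fuzzy alternative is ever kept, and a result without any alternatives can make the call fail. A minimal sketch of a more explicit selection step follows. It assumes only the result structure used above (a list with $data and $alternatives); the helper name pick_canonical and the mock result objects are hypothetical and made up for illustration, not part of rgbif.

```r
# Hypothetical helper (not part of rgbif): choose one canonical name from a
# name_backbone_verbose()-style result, i.e. a list with $data and $alternatives.
pick_canonical <- function(res) {
  best <- res$data
  # Use the backbone match when one was made and a canonical name is present.
  if (isTRUE(best$matchType[1] != "NONE") && !is.null(best$canonicalName)) {
    return(best$canonicalName[1])
  }
  # Otherwise fall back to the first fuzzy alternative, or NA if there is none.
  alt <- res$alternatives$canonicalName
  if (length(alt) > 0) alt[1] else NA_character_
}

# Offline illustration with made-up result objects (no GBIF request is sent).
mock_hit <- list(
  data = data.frame(matchType = "FUZZY", canonicalName = "Lagopus muta",
                    stringsAsFactors = FALSE),
  alternatives = data.frame()
)
mock_miss <- list(
  data = data.frame(matchType = "NONE", stringsAsFactors = FALSE),
  alternatives = data.frame(canonicalName = c("Lagopus lagopus", "Lagopus"),
                            stringsAsFactors = FALSE)
)

pick_canonical(mock_hit)
pick_canonical(mock_miss)

# With network access, the helper could stand in for the ifelse() call:
# Fixed_Species <- vapply(sps, function(x) {
#   pick_canonical(rgbif::name_backbone_verbose(x))
# }, character(1))
```

Using vapply() with character(1) in the commented usage would make the result a plain named character vector and raise an error if the helper ever returned anything other than a single string.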
Session Info
```r
R version 4.3.2 (2023-10-31)
Platform: x86_64-apple-darwin20 (64-bit)
Running under: macOS Sonoma 14.1

Matrix products: default
BLAS:   /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.3-x86_64/Resources/lib/libRlapack.dylib;  LAPACK version 3.11.0

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

time zone: Europe/Oslo
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] taxize_0.9.100

loaded via a namespace (and not attached):
 [1] bold_1.3.0        gtable_0.3.4      jsonlite_1.8.7    crayon_1.5.2
 [5] rgbif_3.7.7       dplyr_1.1.2       compiler_4.3.2    tidyselect_1.2.0
 [9] Rcpp_1.0.11       xml2_1.3.4        stringr_1.5.0     parallel_4.3.2
[13] scales_1.2.1      uuid_1.1-1        lattice_0.21-9    ggplot2_3.4.3
[17] R6_2.5.1          plyr_1.8.8        generics_0.1.3    curl_5.0.2
[21] oai_0.4.0         iterators_1.0.14  tibble_3.2.1      crul_1.4.0
[25] munsell_0.5.0     pillar_1.9.0      rlang_1.1.1       utf8_1.2.3
[29] httpcode_0.3.0    stringi_1.7.12    lazyeval_0.2.2    cli_3.6.1
[33] magrittr_2.0.3    foreach_1.5.2     digest_0.6.31     grid_4.3.2
[37] rstudioapi_0.15.0 lifecycle_1.0.3   nlme_3.1-163      vctrs_0.6.3
[41] glue_1.6.2        data.table_1.14.8 whisker_0.4.1     zoo_1.8-12
[45] codetools_0.2-19  ape_5.7-1         fansi_1.0.4       colorspace_2.1-0
[49] conditionz_0.1.0  httr_1.4.7        tools_4.3.2       pkgconfig_2.0.3
```