How to get match ISSUES from GBIF

peterdesmet commented 5 years ago

I noticed that the issues we collect in get_taxa.Rmd are issues with the backbone taxon, not issues of the checklist taxa:

backbone taxon: http://api.gbif.org/v1/species/4812260: ORIGINAL_NAME_DERIVED
unrelated checklist taxon: https://api.gbif.org/v1/species/141264633: RANK_INVALID

We get ORIGINAL_NAME_DERIVED in our issues, which isn't useful for us because it describes the backbone taxon rather than the checklist taxon. Unfortunately, name_lookup() doesn't seem to return issues as a column:

alien_plants <- rgbif::name_lookup(
  datasetKey = "9ff7d317-609b-4c08-bd86-3bc404b77c42",
  origin = "source",
  limit = 99999,
  return = "data"
)
colnames(alien_plants)

 [1] "key"                 "scientificName"     
 [3] "datasetKey"          "nubKey"             
 [5] "parentKey"           "parent"             
 [7] "kingdom"             "family"             
 [9] "kingdomKey"          "familyKey"          
[11] "canonicalName"       "nameType"           
[13] "taxonomicStatus"     "origin"             
[15] "numDescendants"      "numOccurrences"     
[17] "taxonID"             "habitats"           
[19] "nomenclaturalStatus" "threatStatuses"     
[21] "synonym"             "species"            
[23] "speciesKey"          "rank"               
[25] "genus"               "genusKey"

damianooldoni commented 5 years ago

A solution could be to use name_usage() instead of name_lookup(), since its output does include an issues column:

library(rgbif)  # name_usage()
library(dplyr)  # %>% and filter()

# origin is not an argument of name_usage(), so filter on it afterwards
alien_plants_complete <- name_usage(
  datasetKey = "9ff7d317-609b-4c08-bd86-3bc404b77c42",
  return = "data",
  limit = 99999
) %>%
  filter(origin == "SOURCE")
colnames(alien_plants_complete)

 [1] "key"                 "nubKey"              "nameKey"            
 [4] "taxonID"             "kingdom"             "family"             
 [7] "kingdomKey"          "familyKey"           "datasetKey"         
[10] "parentKey"           "parent"              "scientificName"     
[13] "canonicalName"       "authorship"          "nameType"           
[16] "origin"              "taxonomicStatus"     "nomenclaturalStatus"
[19] "numDescendants"      "lastCrawled"         "lastInterpreted"    
[22] "issues"              "synonym"             "species"            
[25] "speciesKey"          "rank"                "genus"              
[28] "genusKey"
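
With the issues column available, a quick tally shows which issues occur among the source taxa. This is a minimal sketch in base R. It assumes issues arrives as one comma-separated string per row, which is an assumption about the returned format and worth checking against the real data. The rows below are made-up stand-ins for name_usage() output, not real results:

```r
# Made-up rows standing in for name_usage() output (keys and issues are
# illustrative only; the comma-separated issues format is an assumption).
usages <- data.frame(
  key = c(1L, 2L, 3L, 4L),
  origin = c("SOURCE", "SOURCE", "SOURCE", "DENORMED_CLASSIFICATION"),
  issues = c(
    "RANK_INVALID",
    "",
    "RANK_INVALID,BACKBONE_MATCH_FUZZY",
    "BACKBONE_MATCH_NONE"
  ),
  stringsAsFactors = FALSE
)

# Keep only taxa that come directly from the checklist
source_usages <- usages[usages$origin == "SOURCE", ]

# Split each issues string into individual issue names;
# an empty string contributes no issues
issue_list <- strsplit(source_usages$issues, ",", fixed = TRUE)

# Count how often each issue occurs across the source taxa
issue_counts <- table(trimws(unlist(issue_list)))
print(issue_counts)
```

Rows whose origin is not SOURCE are dropped before counting, so their issues do not show up in the tally.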

damianooldoni commented 5 years ago

This issue has implications for the composition of the verification spreadsheet. See here: https://docs.google.com/a/inbo.be/document/d/17TpZNcokL3NW5s9hkRAgvS6hyg6OhyfSFQEZ9xf7-No/edit?disco=AAAACBJyasU

peterdesmet commented 5 years ago

Does name_usage work with a vector of datasetKeys?

damianooldoni commented 5 years ago

No, but we can easily solve this problem by using map_df from purrr:

map_df(checklists_metadata$datasetKey, ~ 
  name_usage(datasetKey = ., limit = 99999, return = "data")
) %>% 
  filter(origin == "SOURCE")

Or, with an explicit function instead of the formula shorthand:

map_df(checklists_metadata$datasetKey, function(x) 
  name_usage(datasetKey = x, limit = 99999, return = "data")
) %>% 
  filter(origin == "SOURCE")
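
For anyone unsure what map_df is doing here: it calls the function once per datasetKey and binds the resulting data frames by row. The same pattern can be sketched in base R with lapply() and rbind(), swapped in here only so the example runs without purrr or network access. fetch_usages() and the dataset keys below are hypothetical placeholders standing in for name_usage() and the real checklist keys:

```r
# Hypothetical stand-in for name_usage(datasetKey = x, ...): returns a small
# made-up data frame per dataset key instead of calling the GBIF API.
fetch_usages <- function(dataset_key) {
  data.frame(
    datasetKey = dataset_key,
    key = c(1L, 2L),
    origin = c("SOURCE", "DENORMED_CLASSIFICATION"),
    stringsAsFactors = FALSE
  )
}

# Placeholder keys, not real GBIF datasets
dataset_keys <- c("dataset-a", "dataset-b")

# One data frame per dataset key
per_dataset <- lapply(dataset_keys, fetch_usages)

# Bind the per-dataset data frames by row into a single data frame
all_usages <- do.call(rbind, per_dataset)

# Keep only taxa that come directly from the checklists
source_only <- all_usages[all_usages$origin == "SOURCE", ]
print(source_only)
```

Unlike rbind(), map_df also copes with data frames whose columns differ between datasets, which is one reason to prefer it for real name_usage() results.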

peterdesmet commented 5 years ago

Done in cd4f8dc133d4df4351eafb2cb82fb0b70f071888.

trias-project / unified-checklist #13