Time to revisit and solidify the user interface for the package.
get_ids() works roughly as it does in taxize, returning a vector of IDs of the same length as the vector of names it is given. This works nicely with mutate() to add an ids column. Unlike taxize, the return format is consistent across all providers. Names that do not match uniquely, as well as names that do not match at all, are returned as NA.
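To make that contract concrete, here is a minimal base R sketch. It is not the package's implementation: get_ids_sketch(), the toy taxa table, the case-insensitive matching, and the choice to return acceptedNameUsageID are all assumptions made for illustration.

```r
# Hypothetical reference table; the column names follow Darwin Core.
taxa <- data.frame(
  scientificName = c("Spatula cyanoptera", "Anas cyanoptera",
                     "Caprimulgus vociferus", "Caprimulgus vociferus"),
  acceptedNameUsageID = c("IUCN:22680233", "IUCN:22680233",
                          "IUCN:22736393", "IUCN:22736398"),
  stringsAsFactors = FALSE
)

# Sketch only: return exactly one ID per input name, and NA when a name
# has no match or matches more than one distinct ID.
get_ids_sketch <- function(names, table = taxa) {
  vapply(names, function(nm) {
    hits <- unique(table$acceptedNameUsageID[
      tolower(table$scientificName) == tolower(nm)
    ])
    if (length(hits) == 1L) hits else NA_character_
  }, character(1), USE.NAMES = FALSE)
}

birds <- data.frame(
  name = c("Spatula cyanoptera", "Caprimulgus vociferus", "trochilid"),
  stringsAsFactors = FALSE
)

# The output has the same length as the input, so it can be assigned
# directly as a new column.
birds$id <- get_ids_sketch(birds$name)
birds
```

Because the result always lines up with the input, the same call should also drop into mutate(), e.g. mutate(birds, id = get_ids_sketch(name)).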
ids() is currently a function that does a filtering join of scientificName values against the ids. It needs a better name, one that sounds like a verb or indicates that it returns a table.
get_names(), the inverse function to get_ids(), doesn't exist yet but should.
synonyms() is a helper utility that returns all the synonyms of a queried name, using repeated rows for each new synonym of a given accepted name, like so:
| input | acceptedNameUsage | synonym | acceptedNameUsageID |
|---|---|---|---|
| spatula cyanoptera | Spatula cyanoptera | Anas cyanoptera | IUCN:22680233 |
| antrostomus vociferus | Antrostomus vociferus | Caprimulgus vociferus | IUCN:22736393 |
| antrostomus arizonae | Antrostomus arizonae | Caprimulgus vociferus | IUCN:22736398 |
| antrostomus arizonae | Antrostomus arizonae | Caprimulgus arizonae | IUCN:22736398 |
| trochilid | NA | NA | NA |
It's not entirely clear to me that this wider format is better than filtering the original dwc (Darwin Core) format on the accepted ids, e.g.:
| scientificName | taxonomicStatus | acceptedNameUsageID | taxonRank | taxonID |
|---|---|---|---|---|
| Spatula cyanoptera | accepted | IUCN:22680233 | species | IUCN:22680233 |
| Antigone canadensis | accepted | IUCN:22692078 | species | IUCN:22692078 |
| Antrostomus vociferus | accepted | IUCN:22736393 | species | IUCN:22736393 |
| Antrostomus arizonae | accepted | IUCN:22736398 | species | IUCN:22736398 |
| Anas cyanoptera | synonym | IUCN:22680233 | species | NA |
| Grus canadensis | synonym | IUCN:22692078 | species | NA |
| Caprimulgus vociferus | synonym | IUCN:22736393 | species | NA |
| Caprimulgus vociferus | synonym | IUCN:22736398 | species | NA |
| Caprimulgus arizonae | synonym | IUCN:22736398 | species | NA |
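One way to weigh the two layouts is to note that the wide one can be derived from the dwc one with a self-join on acceptedNameUsageID. The base R sketch below uses a subset of the rows shown above. The object names (dwc, accepted, syn, wide) are only illustrative, and this is not necessarily how synonyms() builds its result.

```r
# A subset of the dwc-format rows from the table above.
dwc <- data.frame(
  scientificName = c("Spatula cyanoptera", "Antrostomus arizonae",
                     "Anas cyanoptera", "Caprimulgus vociferus",
                     "Caprimulgus arizonae"),
  taxonomicStatus = c("accepted", "accepted",
                      "synonym", "synonym", "synonym"),
  acceptedNameUsageID = c("IUCN:22680233", "IUCN:22736398",
                          "IUCN:22680233", "IUCN:22736398",
                          "IUCN:22736398"),
  stringsAsFactors = FALSE
)

# Split into accepted names and synonyms, renaming the name column in each.
accepted <- dwc[dwc$taxonomicStatus == "accepted",
                c("scientificName", "acceptedNameUsageID")]
names(accepted)[1] <- "acceptedNameUsage"

syn <- dwc[dwc$taxonomicStatus == "synonym",
           c("scientificName", "acceptedNameUsageID")]
names(syn)[1] <- "synonym"

# Inner join on the shared accepted ID: one row per (accepted name, synonym) pair.
wide <- merge(accepted, syn, by = "acceptedNameUsageID")
wide <- wide[, c("acceptedNameUsage", "synonym", "acceptedNameUsageID")]
wide
```

Since the wide table is a short join away from the dwc rows, returning the filtered dwc rows loses nothing and leaves the reshaping to the user.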
descendants() currently works like ids() but takes a name plus rank (e.g. class = "Aves") as the filter. It is also currently overloaded so that it can filter on a list of ids, but that role should be taken by a separate function. (Also, the id filter currently assumes taxonID, while it sometimes makes more sense to match on acceptedNameUsageID instead.)
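A rough sketch of what that separate function could look like, with the id column made an explicit choice. The name filter_by_id_sketch() and its by argument are hypothetical, not part of the package; the rows are taken from the dwc table above.

```r
# A few dwc-format rows; synonyms carry no taxonID of their own here.
dwc <- data.frame(
  scientificName = c("Spatula cyanoptera", "Anas cyanoptera",
                     "Antigone canadensis"),
  taxonomicStatus = c("accepted", "synonym", "accepted"),
  acceptedNameUsageID = c("IUCN:22680233", "IUCN:22680233", "IUCN:22692078"),
  taxonID = c("IUCN:22680233", NA, "IUCN:22692078"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: keep the rows whose chosen id column is in `ids`.
# match.arg() picks "taxonID" unless the caller asks for the other column.
filter_by_id_sketch <- function(table, ids,
                                by = c("taxonID", "acceptedNameUsageID")) {
  by <- match.arg(by)
  keep <- !is.na(table[[by]]) & table[[by]] %in% ids
  table[keep, , drop = FALSE]
}

# Matching on taxonID returns only the accepted record...
only_accepted <- filter_by_id_sketch(dwc, "IUCN:22680233")

# ...while matching on acceptedNameUsageID also returns its synonyms.
with_synonyms <- filter_by_id_sketch(dwc, "IUCN:22680233",
                                     by = "acceptedNameUsageID")
```

Exposing the column as an argument keeps the taxonID default while making the acceptedNameUsageID behaviour available when synonyms should come along.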
common_names() is not implemented.
clean_names() operates on a vector of names and performs a series of transformations by default. The more aggressive of these (binomial) should probably be opt-in instead of opt-out. It may also need more methods to drop things that are abbreviations, digits, non-alpha characters, or text given in () or [], etc.
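As a sketch of the opt-in idea, the function below always applies the mild steps and applies the binomial truncation only on request. The name clean_names_sketch(), the specific steps, and the lower-casing are assumptions rather than the package's actual transformations, and the input strings are made up for illustration.

```r
# Sketch only: mild cleaning by default, binomial truncation is opt-in.
clean_names_sketch <- function(names, binomial = FALSE) {
  # Drop anything given in () or [].
  x <- gsub("\\([^)]*\\)|\\[[^]]*\\]", " ", names)
  # Drop digits and other non-alphabetic characters.
  x <- gsub("[^[:alpha:][:space:]]", " ", x)
  # Collapse runs of whitespace, trim the ends, and lower-case (assumed step).
  x <- tolower(trimws(gsub("[[:space:]]+", " ", x)))
  if (binomial) {
    # Aggressive step: keep only the first two words.
    x <- sub("^([a-z]+ [a-z]+).*$", "\\1", x)
  }
  x
}

messy <- c("Anas cyanoptera (note) [1]", "Antrostomus vociferus subsp. 2")

default_clean  <- clean_names_sketch(messy)
binomial_clean <- clean_names_sketch(messy, binomial = TRUE)
```

With binomial = FALSE as the default, trailing words such as a subspecies epithet survive unless the caller explicitly asks for the truncation.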