Open zachary-foster opened 7 years ago
This might cause a problem when one genus name is used for multiple genera.
does this mean many genera in different higher taxonomic groups? e.g., the same genus name in plants and animals?
Yea, like "Achlya", which is a moth and an oomycete.
possible to allow the user to give a higher taxon group?
That would work well when the user is only looking at one Kingdom, which I expect is usually the case. I think that would have to be handled by taxize though, since if there are multiple matches, only one can be returned. If there was a variant on taxize::classification that returned the taxonomy of all matches, perhaps with an additional column of numbers to specify which match each rank belonged to, then that could be implemented in taxa.
> If there was a variant on taxize::classification that returned the taxonomy of all matches, perhaps with an additional column of numbers to specify which match each rank belonged to
there is the concept in taxize::get_* functions to get all results when > 1 result - and not go through the prompt - but taxize::classification has no equivalent.
though perhaps this is close enough: use get_* to get any number of taxon IDs, then pass to classification, optionally bind classifications together
or does that not do what's needed for this?
I think that would work. Could add an option to taxize::classification named all_matches = TRUE that returns that output format. Probably should always have that extra column if TRUE, even if only one match is returned, so the format is consistent.
Which thing would work?
> there is the concept in taxize::get_* functions to get all results when > 1 result - and not go through the prompt - but taxize::classification has no equivalent.
Sorry for the vagueness. Looking at the code, it looks like the taxize::get_* functions are called when a taxon name is supplied and they handle the prompting of the user. An argument could be added to classification that allows for multiple returns from taxize::get_* without a prompt, looks up the classification of each, and then rbinds them together with an extra column with the matching taxon ID.
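For illustration, the rbind step described above could look roughly like the sketch below. It uses base R only and made-up data, so it is not taxize code: `bind_matches` and the `match_id` column are hypothetical names, the taxon IDs are placeholders, and the two data frames merely stand in for what a per-ID classification lookup might return (assumed here to have `name`, `rank` and `id` columns).

```r
# Placeholder classifications for two taxa sharing the genus name "Achlya".
# The IDs ("a1", "b1", ...) are invented for this sketch.
moth <- data.frame(
  name = c("Insecta", "Lepidoptera", "Achlya"),
  rank = c("class", "order", "genus"),
  id   = c("a1", "a2", "a3"),
  stringsAsFactors = FALSE
)
oomycete <- data.frame(
  name = c("Oomycetes", "Saprolegniales", "Achlya"),
  rank = c("class", "order", "genus"),
  id   = c("b1", "b2", "b3"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: `matches` is a named list of data frames, where each
# name is the taxon ID of one match. Every row is tagged with that ID so the
# matches stay distinguishable after binding.
bind_matches <- function(matches) {
  tagged <- Map(function(df, taxon_id) {
    df$match_id <- rep(taxon_id, nrow(df))
    df
  }, matches, names(matches))
  out <- do.call(rbind, tagged)
  rownames(out) <- NULL
  out
}

combined <- bind_matches(list(a3 = moth, b3 = oomycete))
```

The extra column is added even when only one match is supplied, which keeps the shape of the result the same regardless of how many matches there are.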
Thanks for clarification.
Sounds good up to the point of rbinding together - that would mean a departure from the output format in other cases - and I'd rather not have variable output formats
opened an issue in taxize to explore there https://github.com/ropensci/taxize/issues/628
> and I'd rather not have variable output formats
Yea, variable output formats are not great
Often, there are many species in a data set sharing a genus. When looking up the taxonomy from taxon names, it is inefficient to use the full name in many cases, since the taxonomy of many species can be inferred from a single query of the genus. Also, sometimes a genus is in a database while a species is not. However, this might cause a problem when one genus name is used for multiple genera.
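A minimal sketch of the genus-level lookup idea, in base R. The species names are only example input, and `lookup_genus` is a hypothetical stand-in for whatever database query would actually be used; the point is just that each unique genus is queried once and the result is reused for every species in it.

```r
# Example input; any vector of binomial names would do.
species <- c("Quercus alba", "Quercus robur", "Achlya flavicornis")

# Take the first whitespace-separated word of each name as the genus.
genus <- sub("\\s.*$", "", species)
unique_genera <- unique(genus)

# Hypothetical stand-in for one database query per genus.
lookup_genus <- function(g) paste("classification of", g)

# Query each genus once (the result is named by genus), then map the
# results back onto the species.
genus_results <- vapply(unique_genera, lookup_genus, character(1))
per_species <- genus_results[genus]
names(per_species) <- species
```

Here three species need only two queries. The ambiguity raised in this issue is untouched by the sketch: a genus name shared by unrelated taxa would still come back as a single result.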