Closed dfalster closed 1 year ago
Some variable names to consider renaming or making more consistent:
taxonomic_ref, taxonomic_reference - reference APC, APC accepted, APC known, APNI. This is not a term in DarwinCore, where the references are the primary taxonomic literature references (NamingAuthority). In NSL it is called dataset (too ambiguous). I'm happy with taxonomic_reference, which is what we use in traits.build.
From the traits.build schema: taxon_rank: "The taxonomic rank of the most specific name in the scientific name."
taxonRank, taxon_rank, taxonomic_resolution - this is the rank of a name. In traits.build terms it is also the resolution to which a name can be aligned, which is the conflict. They are slightly different, but I think it is fine to go with taxon_rank. This is an official term in DarwinCore and NSL. In traits.build, we use both. taxonomic_resolution is used within the taxonomic_updates table, defined as "The rank of the most specific taxon name (or scientific name) to which a submitted original name resolves." Then in the taxon table, it is instead taxon_rank, defined as "The taxonomic rank of the most specific name in the scientific name."
binomial, trinomial - These terms, during matching at least, are slight misnomers, because they are actually simply "first two words" and "first three words", once all filler words (sp, spp, var, form, etc) are removed by strip_names_2(). There are lots of phrase names (i.e. species level) that are aligned to the trinomial column. But I'm also happy to keep what we have, because the final traits.build output only fills in the columns if the taxon_rank matches; it is just the intermediary matching step where the columns don't perfectly match the expected taxon_rank.
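To make the "first two words" / "first three words" point concrete, here is a minimal sketch in Python. It is not the package's strip_names_2() implementation: the filler list is only the four examples named above, the lower-casing is an assumption, the helper names strip_fillers and first_n_words are hypothetical, and the phrase name in the example is invented for illustration.

```python
from typing import List, Optional

# Assumed filler tokens, taken from the examples listed above (sp, spp, var, form);
# the real list used by strip_names_2() may well be longer.
FILLER_WORDS = {"sp", "spp", "var", "form"}


def strip_fillers(name: str) -> List[str]:
    """Lower-case a name, drop trailing periods on tokens, and remove filler words."""
    tokens = [t.rstrip(".") for t in name.lower().split()]
    return [t for t in tokens if t and t not in FILLER_WORDS]


def first_n_words(name: str, n: int) -> Optional[str]:
    """Return the first n remaining words, or None if fewer than n are left."""
    words = strip_fillers(name)
    return " ".join(words[:n]) if len(words) >= n else None


names = [
    "Acacia aneura var. tenuis",     # infraspecific-style name
    "Acacia sp. Hypothetical Hill",  # invented species-level phrase name
]
for name in names:
    # The "binomial" column is the 2-word result, the "trinomial" column the 3-word result.
    print(name, "->", first_n_words(name, 2), "|", first_n_words(name, 3))
```

In this sketch the species-level phrase name still yields a three-word result, which is why the intermediate "trinomial" column does not imply an infraspecific taxon_rank.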
ID, as described in issue #117, should be split into scientific_name_ID and taxon_ID for a number of reasons. Most importantly, 1) names and taxa are distinct concepts and 2) scientific_name_ID provides the link between APC & APNI scientific names.
In terms of input parameters for the functions, I think they are largely consistent - the exception is that the term original_name is used in align_taxa versus taxa in match_taxa and create_taxonomic_update_lookup. I can't decide if I think these should be the same term or not.
The first argument for align_taxa is called original_name, while for create_taxonomic_update_lookup it's taxa.
Argument names need to be consistent across the main functions (align_taxa, create_taxonomic_update_lookup and update_taxonomy), and also with outputs. For this argument I suggest original_name is better.