Open Australis86 opened 2 years ago
True, @thomasstjerne we should mark ambiguous synonyms in the taxon pages.
I have to admit I am not a fan of using that status for cases like that, though. The two names Hedyosmum arborescens Griseb. and Hedyosmum arborescens Cordem. ex Baill. are clearly different, having very different authorship. @dhobern @chantalhuijbers I would like to raise this in the taxonomy group and consider using ambiguous synonym only for pro parte names, and maybe even consider using pro parte synonym instead.
@mdoering I believe that this has been fixed in the natural way - i.e. by ensuring that the taxon pages correctly show when a relationship is considered partial synonymy, regardless of which end is viewed. This makes sense, independently of whether or not the status is being correctly applied.
That leaves just an issue of whether we need to clarify expectations around the use of this status. I'm not sure I understand what you wish the taxonomy group to discuss. Does this example from Hedyosmum represent a common pattern that we need to consider more carefully?
Thanks. Yes, the definition and expected use of the status ambiguous synonym is what I am after. Unless we already agree on a clear definition, it might be worth throwing this out for discussion in the taxonomy group. On the portal we have the following definition:
A name that has been used to refer to more than one possible species
The API has the following definition:
Names which are ambiguous because they point at the current species and one or more others e.g. homonyms, pro-parte synonyms (in other words, names which appear more than in one place in the Catalogue).
I think the question mostly boils down to what a name is. Must the authorship match for two names to be considered the same, or not? My understanding has been that we use this status for pro parte synonyms, i.e. names (with authorship) pointing to multiple accepted names, to warn users. If that is the case, we should even be able to determine the status automatically. The Hedyosmum arborescens case given above is not a pro parte name.
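The automatic determination suggested here could be sketched roughly as follows. This is only an illustration under stated assumptions: the record shape (name, authorship, accepted name) and the function name `find_pro_parte` are hypothetical and do not reflect the ChecklistBank data model, and the names are placeholders.

```python
from collections import defaultdict


def find_pro_parte(synonyms):
    """Return the (name, authorship) keys that point to more than one accepted name.

    `synonyms` is an iterable of (scientific_name, authorship, accepted_name)
    tuples. This shape is assumed for illustration only.
    """
    accepted_by_key = defaultdict(set)
    for name, authorship, accepted in synonyms:
        # Authorship is part of the key, so homonyms by different authors stay separate.
        accepted_by_key[(name, authorship)].add(accepted)
    return {key for key, accepted in accepted_by_key.items() if len(accepted) > 1}


# Placeholder records, not real taxa.
records = [
    ("Aus bus", "Smith", "Aus cus"),
    ("Aus bus", "Smith", "Aus dus"),         # same name and author, two accepted names
    ("Xus yus", "Jones", "Xus zus"),
    ("Xus yus", "Brown ex Lee", "Xus wus"),  # homonym with different authorship
]

pro_parte = find_pro_parte(records)
print(pro_parte)
```

Under this stricter reading, only "Aus bus Smith" would be flagged; the two "Xus yus" homonyms would not, because their authorships differ.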
Looking at the current ambiguous synonyms in COL:
@dhobern, @yroskov if we can agree on the stricter and more precise pro parte definition I think we don't need any further discussion. But I sense from the examples above it is not that clear?
Side note - calling it checklist status in the glossary is also worth debating. This is a term we don't use anywhere in the API, ColDP or UIs. There we usually just have "status" in the context of a Name or a Taxon/Synonym. In DwC it is taxonomicStatus and nomenclaturalStatus. In the ColDP docs we also use nomenclatural or taxonomic status:
status: is the taxonomic name usage status which includes Synonym.status and the Taxon.provisional flag. A provisional taxon should be listed as provisionally accepted. Unresolved names which are neither accepted nor synonyms can be listed with status=bare name, in which case only the Name properties are relevant. This corresponds to a lone Name record without a Taxon or Synonym record.
And should we not rather reuse the API definitions dynamically in the glossary, instead of maintaining different definitions there?
I think the question mostly boils down to what is a name. Does it include the authorship to be considered the same or not?
According to the Codes, authorship isn't a part of a scientific name ;) . The format of an author string (author name(s), delimiters, year, abbreviated citation, nomenclatural comment) is mainly defined by editorial practice, not by the Codes.
ICZN (species):
5.1. Names of species
The scientific name of a species, and not of a taxon of any other rank, is a combination of two names (a binomen), the first being the generic name and the second being the specific name. The generic name must begin with an upper-case letter and the specific name must begin with a lower-case letter.
51.1. Optional use of names of authors
The name of the author does not form part of the name of a taxon and its citation is optional, although customary and often advisable.
Botanical Code (species):
6.7. The name of a taxon below the rank of genus, consisting of the name of a genus combined with one or two epithets, is termed a combination (see Art. 21, 23, and 24).
23.1. The name of a species is a binary combination consisting of the name of the genus followed by a single specific epithet...
Taxon author is a part of citation: see CITATION, SECTION 1 AUTHOR CITATIONS https://www.iapt-taxon.org/nomen/pages/main/art_46.html
CoL Ambiguous Synonym status is not equal to pro parte synonyms. It includes homonyms, pro parte synonyms, as well as possible [not yet resolved] mistakes in the source checklist. This status is a warning flag for CoL users who are not taxonomists and do not pay attention to the author strings, nomenclatural statuses and abbreviated comments usually used in author strings.
A small set of generalized CoL/checklist statuses is an essential part of CoL integrity.
@yroskov I feel we are conflating two very different things then:
I would propose to use a distinct status for the two situations. In fact, we could separate out the boolean provisional flag completely and make it a separate property that applies to any of the remaining statuses: accepted, synonym, ambiguous synonym, misapplied. That would result in a rather clean list of pure statuses. I have never seen ambiguous synonym being used outside of the COL checklist.
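To make the proposal concrete, here is a minimal sketch of what a status list with an independent provisional flag might look like. The class and field names (`Status`, `NameUsage`, `label`) are hypothetical and are not taken from the API or ColDP.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # The four "pure" statuses listed in the proposal.
    ACCEPTED = "accepted"
    SYNONYM = "synonym"
    AMBIGUOUS_SYNONYM = "ambiguous synonym"
    MISAPPLIED = "misapplied"


@dataclass(frozen=True)
class NameUsage:
    scientific_name: str
    status: Status
    provisional: bool = False  # independent flag, may combine with any status

    def label(self) -> str:
        """Render a display label, e.g. 'provisionally accepted'."""
        prefix = "provisionally " if self.provisional else ""
        return prefix + self.status.value


# Placeholder name, not a real taxon.
usage = NameUsage("Aus bus", Status.ACCEPTED, provisional=True)
print(usage.label())
```

With this shape, "provisionally accepted" becomes a rendering of two separate properties rather than a status value of its own.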
This seems sensible to me. The significance of the two situations for a user is very different.
I have no objection to distinct statuses for these 2 situations in the case of the extended catalogue. All names that you add programmatically outside and, especially, inside GSDs should get the status "Unresolved".
Who will separate these statuses, and how, in cases where "duplicated" names appear inside a GSD? Only GSD authors/taxonomists can do it professionally. Are they ready to follow your proposal? Not sure. They build their checklists following their own aims and rules.
If there are duplicates, the ambiguous status falls under the first category, which remains as it was. The only situation we are considering changing is that of the non-duplicate names that currently have the ambiguous synonym status, e.g. Abelmoschus moschatus subsp. tuberosus. When I look into the workbench, there is an editorial decision to apply the ambiguous status, but there is no other name like that, neither in COL nor in World Plants. Is this just an error stemming from an older version, when there may have been more copies?
If your definition of ambiguous synonym always requires at least 2 copies of a name there is no need for the unresolved case. And we could actually add some checks that look for outdated decisions applying ambiguous to just one copy.
Yes, ambiguous synonym always requires at least 2 copies of a name.
If a single name has ambiguous synonym status in CoL, it may be due to one of two cases:
(1) An old decision from a previous version incorrectly stays in CLB after the duplicated name has been resolved in the GSD. Abelmoschus moschatus subsp. tuberosus is exactly this case: http://www.catalogueoflife.org/annual-checklist/2019/search/all/key/Abelmoschus+moschatus+tuberosus/fossil/1/match/1
Solution: It would be nice if CLB reported cases of "ambiguous synonym" on a single name as a broken decision. (We already have a report on broken decisions in Tasks, but we need a separate report for these cases.)
(2) The decision was applied to the whole dataset, whereas CoL takes only part of a bigger dataset (ITIS, WCVP).
Solution: It would be nice if CLB generated Tasks reports inside the project only for the sectors included in CoL, not for the entire GSD.
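The report requested for case (1) could be approximated along these lines. This is a rough sketch, not how CLB implements its reports: the input shape (name without authorship, status) and the function name `single_copy_ambiguous` are assumptions, and the rows are illustrative only.

```python
from collections import Counter


def single_copy_ambiguous(usages):
    """List names flagged 'ambiguous synonym' that occur only once.

    `usages` is an iterable of (scientific_name, status) tuples, with the
    name given WITHOUT authorship. This shape is assumed for illustration.
    """
    usages = list(usages)
    copies = Counter(name for name, _ in usages)
    return sorted(
        name
        for name, status in usages
        if status == "ambiguous synonym" and copies[name] == 1
    )


# Illustrative rows based on the names mentioned in this thread.
usages = [
    ("Abelmoschus moschatus subsp. tuberosus", "ambiguous synonym"),
    ("Hedyosmum arborescens", "ambiguous synonym"),
    ("Hedyosmum arborescens", "ambiguous synonym"),
]

broken = single_copy_ambiguous(usages)
print(broken)
```

Names are compared without authorship here, following the definition above that an ambiguous synonym always requires at least 2 copies of a name.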
In that case let's stick with what we have and provide better reports on the project, as you say. To spot outdated decisions on single names: https://github.com/CatalogueOfLife/checklistbank/issues/1333
Decisions based on the entire source dataset which are only copied in part to the project should already show up through the outdated decisions above. To work on the project as a whole you can already use the duplicate tool and project-wide tasks. They are a bit slow because they are large, but they work fine and seem to need some attention: https://www.checklistbank.org/catalogue/3/tasks
So far, the most interesting issue is a true overlap which I introduced with the recent update of WCVP.
@mdoering, it's strange: there is an obvious duplication of species in two plant genera, Hydrocotyle and Trachymene, in the species list. But these genera do not appear in the list of duplicated genera. Why?
Describe the problem: If I search for taxon 1 (e.g. Hedyosmum arborescens) and it is an ambiguous synonym of taxon 2 (e.g. Hedyosmum grisebachii), the COL website search correctly lists this; e.g.:
However, if I search for taxon 2, its page does not show that taxon 1 is an ambiguous synonym - there is no distinction between normal and ambiguous synonyms.
If I use the namesearch API to get taxon 2 (Hedyosmum grisebachii, taxon ID = 3JZVL) and then the taxon API to get the synonyms, there is no indication in the resultant dataset that taxon 1 (Hedyosmum arborescens) is an ambiguous synonym.
However, if I use the namesearch API to retrieve taxon 1 (Hedyosmum arborescens), then it correctly returns the same three results as the webpage, showing that it is an ambiguous synonym of taxon 2.
Is this an API issue or a dataset issue? Can we have it so that the classification is consistent and regardless of which of the two taxa is searched for, it shows when it is an ambiguous synonym of the other?
Link to affected CoL webpages: https://www.catalogueoflife.org/data/taxon/3JZVL https://www.catalogueoflife.org/data/taxon/3JZWK