tdwg / tnc

Taxonomic Names and Concepts Interest Group

property:{TO BE NAMED} to indicate the novel status of a taxon in a publication #118

Open afuchs1 opened 3 years ago

afuchs1 commented 3 years ago

I have been trying to map the new TCS terms to the ingestion of a bibliographic resource, its accepted taxonomic name (accepted TNU) and any synonyms (related TNUs), and cannot find anywhere to map something that indicates a taxonomic name is being published for the first time.

ICN, Recommendation 45A, 45A.1: A new name should be followed by a direct citation indicating its novel status, including the word "novus" (-a, -um) or its abbreviation, e.g. genus novum (gen. nov.), species nova (sp. nov.), combinatio nova (comb. nov.), nomen novum (nom. nov.), or status novus (stat. nov.).

Example : Calytrix insperata Rye sp.nov in https://florabase.dpaw.wa.gov.au/nuytsia/article/939

mdoering commented 3 years ago

That is highly linked to designating the protonym or a combonym as Rich proposes in #46

deepreef commented 3 years ago

Yeah, I was thinking along those lines. But there is a subtle but important distinction: Protonyms are the first chronological use of a name, not necessarily the first Code-compliant establishment of a name. The right way to do this is with a 1:M relationship to something like "nomenclaturalEvents" (a term we came up with to apply Code-relevant actions to TNUs). But I wasn't sure if that's a box we're ready to open yet, and it also might be more relevant to implementations, rather than exchange standards. But don't we have something along the lines of dwc:nomenclaturalStatus in TCS? I'd check, but am just ending my day here, and brain too tired to think about that stuff in too much detail.

deepreef commented 3 years ago

Also, the equivalent of dwc:taxonomicStatus for the taxonomic qualifiers.

mdoering commented 3 years ago

In the COL API I have the fields nomenclaturalNote and taxonomicNote, which hold all the verbatim bits found in the authorship string of names s.l. I found that very useful for deducing the actual status while still representing a given full name verbatim.

nielsklazenga commented 3 years ago

@afuchs1, what you are looking for is the inverse of namePublishedIn. Of the names that are published in the publication, the ones that have neither basionym nor replacedName (or 'ReplacementNameFor' in TCS 1) are new taxa ('tax. nov.'), the ones with a basionym are new combinations ('comb. nov.') and the ones with a replacedName are avowed substitutes ('nom. nov.'). If the name has a different rank than its basionym, it's a 'stat. nov.'.
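A minimal sketch of these rules, assuming a hypothetical `Name` record with optional `basionym` and `replaced_name` links (the class, its field names and the placeholder names "Aus bus" etc. are illustrative, not TCS terms or real taxa). The order in which the checks are applied is also an assumption: the comment above does not say which rule wins when several could apply.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Name:
    """Hypothetical, minimal stand-in for a name record; not a TCS class."""
    name: str
    rank: str
    basionym: Optional["Name"] = None
    replaced_name: Optional["Name"] = None


def novelty_type(n: Name) -> str:
    """Derive the novelty label of a name published in a given work."""
    if n.replaced_name is not None:
        return "nom. nov."   # avowed substitute
    if n.basionym is None:
        return "tax. nov."   # neither basionym nor replaced name
    if n.rank != n.basionym.rank:
        return "stat. nov."  # rank differs from that of the basionym
    return "comb. nov."      # same rank, new combination


basionym = Name("Aus bus", "species")
print(novelty_type(basionym))                                      # tax. nov.
print(novelty_type(Name("Cus bus", "species", basionym=basionym)))  # comb. nov.
```

Deriving the label this way only works if the basionym and replaced-name links are already populated, which is exactly what is missing when a publication is first ingested.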

As @deepreef already suggests, we can also add these terms to the vocabulary of nomenclaturalStatus (TCS 1 'PublicationStatus'), as subtypes of 'valid' (the botanical 'valid').

ghwhitbread commented 3 years ago

In the NSL we capture these status values within the vocabulary (for our equivalent) of TNU-type. So much of how we consume, store, use, transform and disseminate TNU data depends on this unambiguous typing.

Both nomenclaturalStatus (standing in nomenclature) and taxonomicStatus (standing within a classification) are properties of a Name, with values derived independently of the act (and intention) of publication.

This is important stuff ... A robust and extensible type-vocabulary for the TNU could well be our most important contribution.

deepreef commented 3 years ago

> This is important stuff ... A robust and extensible type-vocabulary for the TNU could well be our most important contribution.

I completely agree! But we'll need a system that allows more than one "type" to be applied to a TNU, as many of them are not mutually exclusive.

@ghwhitbread : you may recall one of those NOMINA meetings in Hawaii, where we were in the fish collection office with a whiteboard along with the usual NOMINA suspects (Paul, Nicky, Paddy, etc. -- @mdoering, were you at that one?) and we diagrammed out the idea of "NomenclaturalEvents" (ways of "tagging" TNUs with specific Code-relevant "events" to characterize things like Nomenclatural Acts). We never really implemented it, but I plan to implement it in a big way for the next-gen ZooBank. So perhaps it would be worth visiting now. See the attached diagram, which was the outcome of that NOMINA discussion.

[Attached diagram: TaxonNameUsageCluster]

afuchs1 commented 3 years ago

I am just looking at this from the point of view of the data to be moved around. In this use case, that is what we need to ingest from a publication (how do we identify it as the publication event for a new taxonomic name? I hope I've used the right words here).

From the data we work with (which I accept may be skewed) I see this as a property of the act of publishing a bibliographic resource for a 'new' taxonomic name which is expressed as the accepted taxonomic name usage in the set of TNU's in that resource.

BR: Nuytsia _Calytrix insperata_ (Myrtaceae: Chamelaucieae), a new Western Australian species opportunistically discovered on vacation
---
Has TNU1 : acceptedName **_Calytrix insperata_** Rye, is published for the first time as **sp.nov**  
has TN : _Calytrix insperata_ Rye
---
Has TNU2 : has synonym _Calytrix_ sp. Kennedy Range (A. Markey & S. Dillon 6301) of accepted name _Calytrix insperata_
has TN : _Calytrix_ sp. Kennedy Range (A. Markey & S. Dillon 6301)
---

From this point of view the attribute of 'sp.nov' is specific to this name in this publication event. Subsequent events will determine the validity of the status of the name. ie. whether it is invalid etc - this status is then a property of the name (nomenclaturalStatus)

Are the nomenclaturalEvents the intersection between the taxonomic name of interest and the TNU's in different BibliographicResources?

Excuse my lack of knowledge, but are changes to the nomenclatural status of the name (valid, invalid etc) 'documented' via another publication event or is the status assigned as part of assessing against the relevant nomenclatural act and so the name becomes 'invalid' without a published reference? Thinking about it I presume people don't necessarily agree whether a name is invalid...?

In BR1 --> TNUa (sp.nov) --> for TN1

In BR2 --> TNUb says --> TN1 is invalid therefore TN becomes (nom.illeg)

Please note, this use case is about extracting data from publications, I would like to see if the nomenclatural and taxonomic information contained in them can be expressed using the TNC model.
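As a rough illustration of the BR1/BR2 chain above, the status statements could be kept as separate rows per usage rather than as one value overwritten on the name. The tuple layout and function name here are hypothetical, not part of the TNC model.

```python
from collections import defaultdict

# (bibliographic resource, TNU, taxonomic name, asserted status)
# Identifiers mirror the BR1/BR2 sketch above; the tuple layout is hypothetical.
usages = [
    ("BR1", "TNUa", "TN1", "sp. nov."),
    ("BR2", "TNUb", "TN1", "nom. illeg."),
]


def assertions_by_name(rows):
    """Group status assertions by name, preserving the order of the rows."""
    grouped = defaultdict(list)
    for resource, usage, name, status in rows:
        grouped[name].append((resource, usage, status))
    return dict(grouped)


history = assertions_by_name(usages)
for resource, usage, status in history["TN1"]:
    print(resource, usage, status)
```

Keeping both rows means the 'sp. nov.' statement from BR1 survives even after BR2 asserts a different status for the same name.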

mdoering commented 3 years ago

thanks @afuchs1 this is very helpful. Even if nomenclatural data wants to be objective and rule based there might be different opinions and for sure wrong evaluations of rules or simple data errors. I would think like with other data the different opinions are mostly covered by having distinct datasets. But when one wants to capture exactly what was published and aggregate these statements we have to deal with concurrent opinions on the same "name". Subsequent publications could assert some other nomenclatural status, but I guess in many cases the status is nowhere published in literature. I would think IPNI evaluates a status for example. @deepreef does ZooBank evaluate or capture a status? I think I never saw that. The example of Cancer strigosus given in ICZN as a homonym example is in ZooBank, but none is marked as such.

deepreef commented 3 years ago

@mdoering :

> does ZooBank evaluate or capture a status?

The data model supports it (via NomenclaturalEvents), but we have not yet implemented this on the ZooBank website. This will definitely be part of the "next generation" ZooBank, after we can get some funding. The Commission has for a very long time (since before ZooBank was even created) discussed how we would add a "verification" layer confirming the nomenclatural status of each name, as established through TNUs (including Protonyms). But it would be impossible for the Commission itself to do that work for so many names. Some nomenclators have already done the work, so certainly we would want to accept those status values when they already exist. But the only way to ascribe such status with confidence to names will be to mobilize the army of existing ZooBank users to crowdsource it.

It's all very doable -- just needs some policies, a reasonable workflow, and a clean UI.

afuchs1 commented 3 years ago

thank you @mdoering and @deepreef - trying to summarise so far

It also does not map to https://dwc.tdwg.org/list/#dwc_taxonomicStatus nor to the draft biocode (http://www.plantsystematics.org/reveal/pbio/nomcl/mcnet3.html)

So, going full circle, no resolution. If we are ingesting a publication of a new taxon (via a bibliographicResource), where should this data be captured? Still hopeful the TCS model will accommodate this use case and not require another standard.

mdoering commented 3 years ago

My immediate reaction would be to place the novum status on the TN as it comes in from the TNU. The problem only arises if the TN is normalised; shared between several TNUs and thus can only have a single status. In that case sp.nov. is not appropriate.

As the status is an assertion we might also be interested in who asserted it and when, and allow for multiple assertions. That sounds an awful lot like a TNU, and the nomStatus might be needed on a TNU. Also, how would you faithfully represent a publication that asserts a name is a nom. illeg. but which might turn out to be wrong? That's the crux of strongly normalized data. But isn't this problem the same for other TN properties like the scientific name? The exact spelling of the name could be different from TNU to TNU. This makes me wonder how useful the separation between TN and TNU really is. In my work on COL ChecklistBank, which collates various datasets, I have trouble finding the separation very useful, as you need to mint separate TNs for every dataset anyway in order to faithfully represent what the name is in each list. And at that stage I could as well merge all TN properties into TNU and have a simpler model to work with.

deepreef commented 3 years ago

If I understand this discussion correctly, it underscores why I am distrustful of a class of "thing" representing a TN. There is no single definition of what such a thing would be, and as far as I know, no clear proposal for a definition (i.e., text string, or abstract object? Individual components, or full combination? Misspellings are the "same" name, or different? etc...) This is why I keep emphasizing that the path to salvation in modelling taxonomic information is to anchor everything to TNUs. Forget TN as a separate class -- they seem necessary at first, but in the long run (and even in the short run) add unnecessary complexity to tracking the information we care about. What @mdoering describes above is exactly in line with what I've been trying to preach on this stuff: in short: TNs do not exist except in the context of TNUs. Everything nomenclatural about TNs derives from TNUs.

Taxon concepts are abstract circumscriptions of organisms. Outside of TNUs, TNs are linked to these concepts/circumscriptions only at name-bearing type. Everything taxonomic about concepts/circumscriptions derives from TNUs.

So... everything we care about for both nomenclature and concepts/circumscriptions are represented through sets of TNUs and their key properties.

deepreef commented 3 years ago

I failed to answer the question from @afuchs1 :

> so if we are ingesting a publication of a new taxon (via a bibliographicResource) where should this data be captured? Still hopeful the TCS model will accommodate this use case and not require another standard.

The only way I think we can manage this in a scalable way is to define a class for "NomenclaturalEvent": a layer of information that connects instances of TNUs with nomenclatural properties (such as 'sp. nov.', 'comb. nov.', lectotypifications, first-reviser actions, and whatever other Code-specific actions or events bestow nomenclatural status upon these things we like to think about as "taxonomic names"). A number of us (including @ghwhitbread and others from the botanical world) hashed this out several years ago (see the ER diagram above). We just need to encode it within a data standard (like TCS 2), and implement it in our data systems.
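To make the 1:M idea concrete, here is a minimal sketch of how a TNU might carry any number of nomenclatural events. The class and field names are hypothetical and are not taken from TCS, ZooBank or the ER diagram mentioned above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class NomenclaturalEvent:
    """Hypothetical event tagged onto a TNU; field names are illustrative."""
    event_type: str   # e.g. "sp. nov.", "comb. nov.", "lectotypification"
    code: str         # Code under which the event type is defined, e.g. "ICN"
    asserted_by: str  # who made the assertion (the work itself, a nomenclator, ...)


@dataclass
class TaxonNameUsage:
    """Hypothetical TNU holding zero or more events (the 1:M relationship)."""
    usage_id: str
    name_string: str
    published_in: str
    events: List[NomenclaturalEvent] = field(default_factory=list)

    def event_types(self) -> List[str]:
        return [e.event_type for e in self.events]


tnu1 = TaxonNameUsage(
    usage_id="TNU1",
    name_string="Calytrix insperata Rye",
    published_in="https://florabase.dpaw.wa.gov.au/nuytsia/article/939",
)
tnu1.events.append(NomenclaturalEvent("sp. nov.", "ICN", "original publication"))
print(tnu1.event_types())
```

Because events live in a list, a usage that is both a new combination and, say, a lectotypification needs no special handling, and event types can be scoped per Code through the `code` field.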

ghwhitbread commented 3 years ago

The simple solution is an additional property, TaxonNameUsageType, along with an associated, and extensible, type vocabulary, because the "TNU all the way down" approach requires that each TNU be typed.

When “names” come into being, each nomenclatural act must manifest as a TNU - it is not a Name otherwise - with their types declared (or implied) on publication (tax.nov.[variously], comb.nov., nom.nov., stat.nov., etc. ). This typing tells us what is expected in terms of TNU properties, and relationships.

Then, depending on your model, there are many more TNU types:

If everything is a TNU we do need the means to tell them apart!

deepreef commented 3 years ago

@ghwhitbread : I think you're right that we need a system of classifying TNU "types". However, I'm not sure the ones you list are mutually exclusive. When the "type" of TNU is self-evident from its properties, then it doesn't need to be redundantly indicated by a TNU "type" designation. For example, when taxonomicnameUsageID != acceptedUsageID, then it's self-evidently a heterotypic synonym. Homotypic synonymy is self-evident in the form of "every other combination of the same protonym whose parentUsageID is different from the current record's parentUsageID". The status of 'accepted' is self-evident when taxonomicnameUsageID == acceptedUsageID. And so on...
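A small sketch of deriving these statuses from identifiers alone, under simplifying assumptions: each record is a hypothetical flat `Usage` with a `protonym_id`, and a synonym counts as homotypic when it shares the protonym of its accepted usage. That test is a simplification of the parentUsageID comparison described above, not a restatement of it.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Usage:
    """Hypothetical flat TNU record; field names are illustrative."""
    usage_id: str
    accepted_usage_id: str
    protonym_id: str


def derived_status(u: Usage, by_id: Dict[str, Usage]) -> str:
    """Infer status from the record's own identifiers, with no stored 'type'."""
    if u.usage_id == u.accepted_usage_id:
        return "accepted"
    accepted = by_id[u.accepted_usage_id]
    if u.protonym_id == accepted.protonym_id:
        return "homotypic synonym"
    return "heterotypic synonym"


records = [
    Usage("u1", "u1", "p1"),  # points at itself
    Usage("u2", "u1", "p1"),  # same protonym as the accepted usage
    Usage("u3", "u1", "p2"),  # different protonym
]
by_id = {u.usage_id: u for u in records}
for u in records:
    print(u.usage_id, derived_status(u, by_id))
```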

Where I do think we need to track something like taxonomicNameUsageType is for distinguishing things like TNUs representing robust treatments vs. TNUs representing only mentioning the name in passing.

Also, we need to nail down some definitions of things, especially terms like "misapplication". The only definition of that term that makes sense to me is when someone asserts a circumscription associated with a name, where the person asserting the circumscription would him/herself not have included the name-bearing type within that circumscription. In other words, when someone applies a name to a circumscription based on a misunderstanding of what the type of the name is. These are very rare, and very difficult to objectively document. In most/all other contexts, there is little or no objective distinction between "misapplication" and "taxonomic disagreement".

nielsklazenga commented 3 years ago

@afuchs1 and I have been looking at it together and think this is covered by the TaxonConcept@type (DataSet/TaxonConcepts/TaxonConcept/@type) attribute in TCS 1 (#107). 'tax. nov.' (and 'sp. nov.' etc.) fall under original, while comb. nov., nom. nov. and stat. nov. fall under, and could be subcategories of, revision.

deepreef commented 3 years ago

Are there concerns about mixing "types" that are nomenclatural vs. "types" that are taxonomic? One of the challenging issues is that certain "types" that are taxonomic under one Code, might be nomenclatural under another (e.g., comb. nov.). This is why we came up with the idea of nomenclaturalEvent, where different event types can be defined for different Codes. This also has the benefit of managing ambiregnals effectively.

ghwhitbread commented 3 years ago

@nielsklazenga & @afuchs1 : I hope you’re not suggesting that we obfuscate these data elements using the generic, TCS 101 vocabulary. That dumbing down was a significant factor in the lack of uptake. Besides, and ironically, that specific taxon concept thinking is not compatible with all of our nomenclatural and taxonomic data interchange needs. Am I mistaken in the belief that we were adopting a broader TNU approach, along with the requirement for a richer type vocabulary?

@deepreef: This is not about mixing types. That idea is orthogonal to the need for a system for TNU typing. The nomenclatural and taxonomic events and opinions that constitute a TNU’s context are independent of the factual purpose, stated or implied, for a name within a work. In this sense the property TNU-type is simply a descriptor for each usage. Whether the TNU establishing a new generic combination is recognised as a nomenclatural event, or not, does not change the fact that the purpose of the TNU was to make that combination - in any code.

deepreef commented 3 years ago

Thanks, @ghwhitbread ; but I have to quibble with this:

> Whether the TNU establishing a new generic combination is recognized as a nomenclatural event, or not, does not change the fact that the purpose of the TNU was to make that combination - in any code.

It depends on what you mean by "purpose". Yes, in many cases, when zoological species names are combined with a different genus name for the first time, the authors were aware they were making this "change". And sometimes they even state it explicitly. But in many other cases, authors may be unaware that they happen to be the first publication to establish a new combination. While sometimes authors state "comb. nov." explicitly, I bet that across the history of zoological nomenclature, in the majority of cases the authors do not explicitly make this statement. And if they don't explicitly make the statement, we have no way of knowing whether they intended (or were aware; i.e., had "purpose") that they were the first to publish that particular combination.

Sure, when authors assert "comb. nov.", we should record this assertion. But I still see it as an assertion, not a fact. Because in Zoology this is not a Code-governed act, authors have not paid as close attention to it, so there will be many cases of asserted "comb. nov." that actually aren't; and many (MANY) more cases of actual "comb. nov." TNUs that are not asserted as such.

Moreover, as I said previously, if there is a single "Type" property for a TNU, then the allowable values should be strictly mutually exclusive. If there is ever a need to record this property as "x and y" (more than one "type" applies), then we should move instead towards a nomenclaturalEvent structure to capture potentially more than one value of "type" for a given TNU.