tdwg / dwc

Darwin Core standard for sharing of information about biological diversity.
https://dwc.tdwg.org
Creative Commons Attribution 4.0 International

New term - typifiedName #28

Open tucotuco opened 9 years ago

tucotuco commented 9 years ago

New Term

Submitter: Markus Döring

Justification: Clear separation of the type status and the typified scientific name that is typified by a type specimen, the subject. Looking at how dwc:typeStatus has been used in all of GBIF's specimen data one can see there is the need to express this, but it should better be handled with a term on its own and leave typeStatus for the status of the type only. The term name itself is also used by ABCD: http://wiki.tdwg.org/twiki/bin/view/ABCD/AbcdConcept0603

Organized in Class (e.g., Occurrence, Event, Location, Taxon): Identification

Definition: Scientific name of which Organism is a nomenclatural type.

Comment: It is recommended to also indicate the typeStatus of the Organism.

Refines: None

Replaces: None

ABCD 2.06: DataSets/DataSet/Units/Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypifiedName

Original comment:

Was https://code.google.com/p/darwincore/issues/detail?id=197

==New Term Recommendation==

Submitter: Markus Döring

Justification: Clear separation of the type status and the typified scientific name that is typified by a type specimen, the subject. Looking at how dwc:typeStatus has been used in all of GBIF's specimen data one can see there is the need to express this, but it should better be handled with a term on its own and leave typeStatus for the status of the type only. The term name itself is also used by ABCD: http://wiki.tdwg.org/twiki/bin/view/ABCD/AbcdConcept0603

Definition: The scientific name that is based on the type specimen.

Comment: It is recommended to also indicate the typeStatus of the specimen.

Refines:

Has Domain:

Has Range:

Replaces:

ABCD 2.06: DataSets/DataSet/Units/Unit/SpecimenUnit/NomenclaturalTypeDesignations/NomenclaturalTypeDesignation/TypifiedName

A typical example of how typeStatus is currently used is:

ISOTYPE of Polysiphonia amphibolis Womersley

which we could express much better with 2 terms:

dwc:typeStatus=ISOTYPE
dwc:typifiedName=Polysiphonia amphibolis Womersley
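The split described above can be sketched in a few lines of Python. This is only an illustration, assuming legacy values follow the "<STATUS> of <scientific name>" pattern from the example; the function name `split_type_status` and the regular expression are hypothetical and not part of Darwin Core or any GBIF tooling.

```python
import re

# Hypothetical pattern for legacy values such as
# "ISOTYPE of Polysiphonia amphibolis Womersley" (an assumption, not a standard).
_LEGACY_PATTERN = re.compile(
    r"^\s*(?P<status>\S+)\s+of\s+(?P<name>.+?)\s*$", re.IGNORECASE
)


def split_type_status(value):
    """Split a combined typeStatus string into the two proposed terms."""
    match = _LEGACY_PATTERN.match(value)
    if match is None:
        # No " of <name>" part found: treat the whole value as the status.
        return {"dwc:typeStatus": value.strip().upper(), "dwc:typifiedName": None}
    return {
        "dwc:typeStatus": match.group("status").upper(),
        "dwc:typifiedName": match.group("name"),
    }


print(split_type_status("ISOTYPE of Polysiphonia amphibolis Womersley"))
```

Real data are far less regular than this single pattern, which is part of the argument for letting publishers supply the two values as separate terms.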

nielsklazenga commented 3 years ago

Okay, let's move on.

Can we get back to the definition of the term?

The definition given at the top of the proposal is:

The scientific name that is based on the type specimen.

I proposed to change that (29 comments ago) to:

Scientific name of which the specimen is a nomenclatural type

Arguments:

mjy commented 3 years ago

I second that proposal and feel it is much better. "Is" reflects the objective nature of this assertion. We are asserting a fact (that with proper identifiers on the specimens is about as objective as we can be) by putting data in this field. There is no "based" in this assertion.

deepreef commented 3 years ago

I also agree with the revised wording proposed by @nielsklazenga

nielsklazenga commented 3 years ago

I have another proposal. As it now stands, it could be (and will be imminently) argued that this proposal does not meet the stability requirement in the VMS (https://github.com/tdwg/vocab/blob/master/vms/maintenance-specification.md#31-justifications-for-change), as it changes the usage of dwc:typeStatus in a way that is not in accordance with its definition.

In order for this proposal to work, we need to either:

I propose the latter option, as, if anything, it is the easier option, because changing the definition of dwc:typeStatus will also run afoul of the stability requirement and also because it is the much more elegant solution. Having the extra term will allow us to provide the entire typification as a string, which has detail, as well as in separate properties for each of its components, so that it is more easily processed.

So, using @mdoering's initial example:

and/or

and perhaps:

[That was just me indulging myself; don't do that when doing typification in the Occurrence Core, or anywhere else where there can be more than one scientific name in a record]

dwc:typeStatus still has some detail that is not in any of the other properties.

I think this deals entirely and properly with typification when coming from the specimen rather than the name and is not a "band-aid" solution.

I am happy to write the proposal for the new property if this gets the thumbs-up.

mjy commented 3 years ago

Deprecation and new alternatives is always good. typifiedName + typeOfType is more precise and in line with how data are actually captured. It's also more actionable as I don't have to write a regex to actually use the data, and I don't have to have a controlled vocabulary (type of types) to seed that regex.

nielsklazenga commented 3 years ago

@mdoering , I noticed that there is a 'Class - Taxon' label on this issue. Are you able to change that?

deepreef commented 3 years ago

I'd be OK with that; but what about typificationPublication? That is also part of the existing typeStatus definition. And another component that's missing from the existing typeStatus definition, but is critically important, is something like typificationMethod (e.g., original designation, subsequent designation, monotypy, absolute tautonomy, etc.). It could be argued that this should wait for TCS 2; but the same argument could also be made for typifiedName and typeOfType.

nielsklazenga commented 3 years ago

I'd refer to what @tucotuco said earlier in this thread:

A typifiedName term didn't get created because the community did not make a request for it before or during the public review and one of the principles of Darwin Core is to not add anything that isn't demanded. And now that is embodied in section 3.1 Justifications for change in the Vocabulary Maintenance Specification (https://github.com/tdwg/vocab/blob/master/vms/maintenance-specification.md).

I looked it up in the VMS and that is the efficacy requirement.

I think the demand to have the kind of type and the typified name as separate properties has now been demonstrated. I personally have no need for the other two properties you mention, but I certainly will not oppose them, if someone makes the case for them.

By the way, this (the fact that there is more to dwc:typeStatus than the kind of type and the typified name) is one of the reasons why I want to keep typeStatus in its current form.

mdoering commented 3 years ago

@mdoering , I noticed that there is a 'Class - Taxon' label on this issue. Are you able to change that?

Yes. I changed it to Identification - or do we want (also) Occurrence?

nielsklazenga commented 3 years ago

Thanks Markus, no, let's not open that box again.

mdoering commented 3 years ago

I think I am fine with the proposal to also create typeOfType - or at least some new term for this. I don't think the term typeOfType is very intuitive, I would not know what this is about without reading. Which might be a good thing.

The current typeStatus definition is slightly weird as it is a bit recursive:

A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject.

So dwc:typeStatus is defined to be composed of 3 things, the first being the type status. That to me flags that the status of a type specimen is just the typeOfType and no more. Maybe changing the definitions would not be considered a breaking change in that light? If you search in GBIF for specimens with a type status it is just that, the status: https://www.gbif.org/occurrence/search?type_status=HOLOTYPE

You find this use in many places and literature. I still think the intuitive definition for typeStatus would just be the status, and no scientific name or publication in there.

deepreef commented 3 years ago

@mdoering :

I still think the intuitive definition for typeStatus would just be the status, and no scientific name or publication in there.

Yes, this has been my point all along. The original definition for typeStatus was to capture what already exists in most specimen databases, which is a single word consistent with what @nielsklazenga proposes for typeOfType. I think it was a mistake to change the definition of this term (as was done in 2007) to include both the typeStatus as originally defined, plus the name that is typified, plus the publication that established the typeStatus. The circularity of the definition (i.e., that a part of the value for typeStatus is "type status") belies how the term was overloaded. I must not have been paying attention in 2007 when the term was redefined to include additional components that we're now referring to as typifiedName and what we might call typificationPublication (or something like that).

If I understand the current proposal, it is to leave the current (post 2007) definition of typeStatus as is, with three included elements (what @nielsklazenga refers to as typeOfType, plus the content of the proposed new term typifiedName, plus the publication information referring to when the type was fixed). Then add an additional term for typeOfType to capture what typeStatus was originally defined for (pre-2007), and also add an additional term for typifiedName to capture the second element of what the current definition of typeStatus is meant to include.

In summary: pre-2007: typeStatus = single word (Holotype, Paratype, Lectotype, Isotype, etc.) -- and if I understand you correctly, most GBIF content still represents it this way.

post-2007: typeStatus = "A list (concatenated and separated) of nomenclatural types (type status, typified scientific name, publication) applied to the subject." This suggests to me that the term "nomenclatural type" consists of three pieces of information ("type status", "typified scientific name", and "publication"), and if a specimen has been designated a type of more than one name, then all three pieces should be included for each type designation, with each set-of-three-pieces (i.e., each "nomenclatural type") concatenated and separated in the form of a list. Or maybe the three pieces are what should be concatenated and separated from each other based on the assumption that there is only one nomenclatural type for each subject -- it's not clear.

Current proposal(s):

I've already made it clear that I think the definition of typeStatus should revert to its pre-2007 form (aka, what @nielsklazenga refers to as typeOfType), and if we need the other elements (typified scientific name and/or publication), then new terms should be proposed for those. But I've spent too much time on this discussion already, so I'm not going to make this argument formally.

nielsklazenga commented 3 years ago

@mdoering , @deepreef, I have copied your last comments to #327.

nielsklazenga commented 3 years ago

Not so much a proposal as a summary of what we discussed earlier:

Usage comment: typifiedName should be used together with typeStatus.

This is typeStatus with the altered definition proposed in issue #328.

baskaufs commented 3 years ago

I believe that since this term is specifically talking about a name, it is probably going to only be expressed as a dwc: term and not have a dwciri: analog. We avoided going down the road of trying to create dwciri: analogs for Taxon class terms under the assumption that a TCS revision would take care of an RDF representation of "taxa" (whatever that means) at some point in the future (actually, now, since TCS IS under revision at the moment). But this is outside of my expertise, maybe @nielsklazenga can weigh in on this opinion.

nielsklazenga commented 3 years ago

@baskaufs, yes, that's right.

tucotuco commented 3 years ago

Since this proposal is dependent on the change to typeStatus proposed in Issue #328, and that issue was withdrawn from consideration for current public review, I am removing this issue as well.

timrobertson100 commented 1 year ago

Since this proposal is dependent on the change to typeStatus proposed

I'm not sure this is entirely true, is it? I understand the issue with altering the definition of typeStatus (even though I think reverting is the most sensible thing to do) but I sense there is a desire to have typifiedName in its own right. We might also consider that even if the name is encoded in typeStatus as is currently recommended, we have the precedent in DwC to have an additional field since e.g. specificEpithet exists when scientificName could arguably suffice.

As per this GBIF issue, ABCD has this term, and GBIF already extracts and creates it in a GBIF namespace for downloads and views. It would be better if the publisher could simply declare it and GBIF used DwC terms for the search (e.g. /occurrence/search?typifiedName=...) and download representations.

tucotuco commented 1 year ago

The dependency on typeStatus is based on this comment.

tucotuco commented 1 year ago

It would be great to get clarification on whether there is any dependency on issue #328, which remains controversial. In any case, this term can be included in the next Darwin Core public review.

nielsklazenga commented 1 year ago

I am still very much in favour of having this property.

In TCS 2, we define a NomenclaturalType class that can also be used as object with dwciri:typeStatus. I think it would be good to have dwc: analogues for tcs:typifiedName and tcs:typeOfType (the other properties are probably not as relevant when dealing with specimen data), but it might be best to wait until TCS 2 has been ratified. I hope this can be before the end of the year.

I propose to close #328 (which I opened).

tucotuco commented 1 year ago

Controversial status eliminated. Recommendation to create Task Group removed. This will be included in the next round of public review. Thanks @nielsklazenga.

ArthurChapman commented 2 months ago

From the TDWG Biodiversity Data Quality Interest Group - we strongly support this proposal.

In trying to run our Data Quality Tests (specifically https://github.com/tdwg/bdq/issues/285 and https://github.com/tdwg/bdq/issues/286) we are testing against the "Darwin Core typeStatus" (https://dwc.tdwg.org/list/#dwc_typeStatus) and "dwc:typeStatus vocabulary API" (https://gbif.github.io/parsers/apidocs/org/gbif/api/vocabulary/TypeStatus.html) vocabularies, which only have the first part of the dwc:typeStatus data.

We are looking at using pipes (|) to test against just the first part of the data in the DwC term, but it makes sense to have separate terms as suggested by @mdoering - i.e. dwc:typeStatus and dwc:typifiedName (or equivalent - something like dwc:typeStatusCitation)

I am sure that many databases only use the first part of the DwC term - i.e. the Type of Type - and I can see many cases where this would be valuable in both collection management and database maintenance.

Until such a change is implemented by Darwin Core, we will have no option but to attempt to separate the information using pipes for our tests.
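As a rough illustration of the pipe-based workaround mentioned above, the following Python sketch tests only the first pipe-separated part of a dwc:typeStatus value against a vocabulary. It assumes that publishers separate the parts with "|"; the names `first_part`, `type_status_in_vocabulary`, and `KNOWN_TYPE_STATUSES` are hypothetical, and the vocabulary shown is a small illustrative subset rather than the full list used by the BDQ tests.

```python
# Illustrative subset only; a real test would load the full controlled vocabulary.
KNOWN_TYPE_STATUSES = {"HOLOTYPE", "ISOTYPE", "LECTOTYPE", "PARATYPE", "SYNTYPE"}


def first_part(type_status, separator="|"):
    """Return the text before the first separator, stripped and upper-cased."""
    return type_status.split(separator, 1)[0].strip().upper()


def type_status_in_vocabulary(type_status):
    """Check whether the first part of the value is a known type status."""
    return first_part(type_status) in KNOWN_TYPE_STATUSES


print(type_status_in_vocabulary("Isotype | Polysiphonia amphibolis Womersley"))
print(type_status_in_vocabulary("ISOTYPE of Polysiphonia amphibolis Womersley"))
```

The second call fails the check because the value contains no pipe, so the whole string is compared against the vocabulary. Values like this are one reason a separate dwc:typifiedName term would make such tests simpler.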