tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-AMENDMENT_TAXONRANK_STANDARDIZED #163

Open ArthurChapman opened 5 years ago

ArthurChapman commented 5 years ago
| TestField | Value |
|---|---|
| GUID | e39098df-ef46-464c-9aef-bcdeee2a88cb |
| Label | AMENDMENT_TAXONRANK_STANDARDIZED |
| Description | Propose amendment to the value of dwc:taxonRank using bdq:sourceAuthority. |
| TestType | Amendment |
| Darwin Core Class | Taxon |
| Information Elements ActedUpon | dwc:taxonRank |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonRank is EMPTY; AMENDED the value of dwc:taxonRank if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | TAXONRANK_STANDARDIZED |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "GBIF TaxonRank Vocabulary" {[https://api.gbif.org/v1/vocabularies/TaxonRank]} {"dwc:taxonRank vocabulary API" [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]} |
| Specification Last Updated | 2023-09-18 |
| Examples | [dwc:taxonRank="sp.": Response.status=AMENDED, Response.result=dwc:taxonRank="species", Response.comment="dwc:taxonRank contains an interpretable value according to the bdq:sourceAuthority"] [dwc:taxonRank="sic.": Response.status=NOT_AMENDED, Response.result="", Response.comment="dwc:taxonRank does not contain an interpretable value according to the bdq:sourceAuthority"] |
| Source | TDWG2018 |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | |
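
The Expected Response above can be read as a short decision sequence. The following is a minimal sketch of that sequence, not an implementation from this issue: the function name `amend_taxon_rank`, the shape of the source authority (a dict from lowercased variant to a set of preferred terms, or `None` when unavailable), and the sample entries are all assumptions made for illustration.

```python
# Hypothetical stand-in for bdq:sourceAuthority: variant -> set of preferred terms.
# The entries are invented for illustration, including the ambiguous "f." case.
SAMPLE_AUTHORITY = {
    "sp.": {"species"},
    "species": {"species"},
    "f.": {"form", "family"},
}


def amend_taxon_rank(taxon_rank, source_authority):
    """Return (status, result) in the order given by the Expected Response."""
    # The source authority could not be loaded at all.
    if source_authority is None:
        return ("EXTERNAL_PREREQUISITES_NOT_MET", None)
    # dwc:taxonRank is EMPTY.
    if taxon_rank is None or taxon_rank.strip() == "":
        return ("INTERNAL_PREREQUISITES_NOT_MET", None)
    candidates = source_authority.get(taxon_rank.strip().casefold(), set())
    # Amend only when exactly one preferred term matches and it differs.
    if len(candidates) == 1:
        (preferred,) = candidates
        if preferred != taxon_rank:
            return ("AMENDED", {"dwc:taxonRank": preferred})
    return ("NOT_AMENDED", None)


print(amend_taxon_rank("sp.", SAMPLE_AUTHORITY))
print(amend_taxon_rank("sic.", SAMPLE_AUTHORITY))
```

Under these assumptions the two calls mirror the two Examples in the table: "sp." is amended to "species" and "sic." is left unamended.
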
chicoreus commented 5 years ago

Added guid.

chicoreus commented 4 years ago

I'd be a little hesitant to recommend the GBIF rank vocabulary. It is zoologically centric (there is no entry for Division, and translations of Division are mapped onto phylum, which is at the same level but isn't the same rank), most of the obscure infrasubspecific ranks found in botany are absent (particularly lusus and prolus), and the virological realm and subrealm are missing, etc.

tucotuco commented 4 years ago

One could propose the required changes to the GBIF vocabulary. I suspect paleo would have something to say about what is there as well.

ArthurChapman commented 4 years ago

I agree with @tucotuco - best is to propose some changes to GBIF's vocabulary. I am sure they would be receptive. There is no good alternative that covers everything so comprehensively. Incidentally, lusus and prolus are not formal ranks under the International Code of Nomenclature for algae, fungi, and plants.

Tasilee commented 4 years ago

@tucotuco - I also support the suggestion, and agree with @ArthurChapman that GBIF should be supportive

Tasilee commented 4 years ago

All: This is a random test to demo my updates to the table fields Parameter(s), References (in this case removed, as there are no additional references beyond the XML endpoint defined in Notes) and Notes, which is a draft template of what we agreed. Please check and give a thumbs up or otherwise.

ArthurChapman commented 4 years ago

Looks OK - but in that case - that is a good reference - only other references would be to the Codes of Nomenclature and I'd prefer not to go there. Eventually, the default may be to an API, in which case you would want to retain the reference to the human readable vocabulary.

Tasilee commented 2 years ago

Corrected example: changed 'dwc:taxonRecord="sp." becomes dwc:taxonRank="Species"' to 'dwc:taxonRank="sp." becomes dwc:taxonRank="Species"'.

ArthurChapman commented 2 years ago

I have attempted my first addition of a Definition

| Definition | An Amendment of the value in Taxon Rank to conform with the value obtained from a Parameterized Source Authority. If no parameter is set, the source authority defaults to the latest Taxonomic Rank GBIF Vocabulary. |

ArthurChapman commented 2 years ago

Question. Is it OK here to use "Taxon Rank" in the definition, or should it be "dwc:taxonRank"? I was trying to go towards plain English, but this may not work in all cases.

e.g.

| Definition | An Amendment of the value in Taxon Rank to conform with the value obtained from a Parameterized Source Authority. If no parameter is set, the source authority defaults to the latest Taxonomic Rank GBIF Vocabulary. |

OR

| Definition | An Amendment of the value in dwc:taxonRank to conform with the value obtained from a Parameterized Source Authority. If no parameter is set, the source authority defaults to the latest Taxonomic Rank GBIF Vocabulary. |

chicoreus commented 2 years ago

@ArthurChapman yes, I think taxon rank is fine in the definition.

I'd suggest shorter.

The definitions do need to include a reference to a single record.

Perhaps:

| Definition | An Amendment of the value of Taxon Rank in a single record to conform with the value obtained from a specified Source Authority. |

chicoreus commented 2 years ago

Or:

| Definition | An Amendment of the value of Taxon Rank in a single record to conform an existing value to a specified controlled vocabulary. |

chicoreus commented 2 years ago

Or:

| Definition | An Amendment of the value of Taxon Rank in a single record to conform the provided value to a specified controlled vocabulary. |

I've found myself using "provided value" a lot in documentation in implementations.

ArthurChapman commented 2 years ago

Following @chicoreus's suggestion. NB - I used the word conform as it is a Conformance test (a la the Framework)

| Description | An Amendment of the value of Taxon Rank in a single record to conform to the provided value from a specified controlled vocabulary. |

  1. I would be happy with that definition now (I fixed some minor words - to's and from's) - are we happy with "specified controlled vocabulary" or do we want to mention "Source Authority"? If the former, do we need to add "controlled vocabulary" to #152 or is it OK as accepted English?
  2. Should it be (in plain English) "...to the value provided from ..." rather than "...to the provided value from..."? We are going to have a lot of these (most of the amendments), so let's get it right from the start.
Tasilee commented 2 years ago

As suggested elsewhere, I would prefer to see consistent and ubiquitous usage of "bdq:sourceAuthority"

"...to the value from bdq:sourceAuthority" to be consistent.

ArthurChapman commented 2 years ago

If we are using plain English and "Taxon Rank" rather than "dwc:taxonRank", wouldn't we use "Source Authority" rather than "bdq:sourceAuthority"? My thinking is that for a plain English definition, we shouldn't require a user to look up a vocabulary, etc. to see what is meant by "bdq:sourceAuthority". My question in #163 was whether we wanted to use "specified Source Authority" or "specified controlled vocabulary" - the "specified" implies Parameterization.

tucotuco commented 2 years ago

I have expressed parallel concerns in https://github.com/tdwg/bdq/issues/112#issuecomment-1073987232. I'll repeat some here.

1) This test also does not need to refer to a "record" and may be unnecessarily limiting to do so.
2) "Taxon Rank" is not defined anywhere and can be a source of ambiguity. dwc:taxonRank is rigorous and is supposed to be the subject of the Amendment.
3) For similar reasons, I agree with @Tasilee that "bdq:sourceAuthority" is far preferable to "Source Authority", which many people would also need to look up, and would not be able to because it does not exist.
4) In any case, "controlled vocabulary" is not correct. A controlled vocabulary alone would not be able to provide the value to amend to unless it was already in the controlled vocabulary, in which case the Amendment could never actually amend anything. The least capable vocabulary type that would work is a thesaurus.
5) The "thing" being described is a test (whose type is an "Amendment"). I think we should explicitly say it is a test.

Taking all these into account, I would formulate the description as: "A test that amends the value of dwc:taxonRank to conform with the corresponding preferred value given by the bdq:sourceAuthority, if it can be done unambiguously."
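
The distinction in point 4 between a controlled vocabulary and a thesaurus can be sketched as follows. This is an illustration only: the term lists and the function names `is_valid` and `propose` are hypothetical and do not come from any source authority discussed here.

```python
# A controlled vocabulary only lists the preferred terms (hypothetical subset).
CONTROLLED_VOCABULARY = {"species", "genus", "family"}

# A thesaurus also maps non-preferred variants onto a preferred term (hypothetical).
THESAURUS = {"sp.": "species", "gen.": "genus", "fam.": "family"}


def is_valid(value):
    """All a controlled vocabulary can answer: is the value a preferred term?"""
    return value in CONTROLLED_VOCABULARY


def propose(value):
    """A thesaurus can answer more: which preferred term does a variant map to?"""
    return THESAURUS.get(value.strip().casefold())


# "sp." is not in the controlled vocabulary, and the vocabulary alone offers no
# replacement for it; the thesaurus supplies the preferred term to amend to.
print(is_valid("sp."), propose("sp."))
```

With only `CONTROLLED_VOCABULARY`, a value either already conforms (nothing to amend) or fails with no target to amend to, which is the argument made in point 4.
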

ArthurChapman commented 2 years ago

I accept most of the arguments by @tucotuco and I suggest the description below. I have rearranged "unambiguously" and "provided", and included "specified" to indicate a parameterized bdq:sourceAuthority:

| Description | A test that amends the value of dwc:taxonRank to unambiguously conform to the corresponding value provided from a specified bdq:sourceAuthority. |

ArthurChapman commented 2 years ago

Following further discussion by @chicoreus in Issue #112 and his argument that we need to include "single record" (I am not fully convinced but can live with it), and adding a new field of Brief, I suggest:

| Description | A test that proposes to amend the value of dwc:taxonRank in a single record to unambiguously conform to the corresponding value provided from a specified bdq:sourceAuthority. |

| Brief | Amendment proposed for dwc:taxonRank to standard value|

Thumbs up please if we agree and I can start working on the other tests.

Tasilee commented 2 years ago

I am not convinced of the need for “Brief”. If we can’t succinctly describe it in “Description”, we fail. Why have ‘two bites at the cherry’ when one will do? Occam’s razor.

tucotuco commented 2 years ago

@ArthurChapman I still think that the description should not have the "single record" part in it, but I am willing to not fight it to move forward.

Tasilee commented 2 years ago

IF (and to me it is a big if) 'Single record' is REQUIRED for conformance with the Framework, then surely it should be a part of the specifications - as we had it previously (as "Resource Type"). We found that all the 'tests' were "single record" so we removed it.

chicoreus commented 2 years ago

@Tasilee

Consider the following RDF:

<rdf:Description rdf:about="urn:uuid:e39098df-ef46-464c-9aef-bcdeee2a88cb">
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#Specification"/>
    <rdfs:description rdf:datatype="http://www.w3.org/2001/XMLSchema#string">EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMENDED if the value of dwc:taxonRank was standardized using the bdq:sourceAuthority; otherwise NOT_AMENDED</rdfs:description>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">AMENDMENT_TAXONRANK_STANDARDIZED</rdfs:label>
</rdf:Description>

<rdf:Description rdf:about="urn:uuid:f559e3a3-c756-44f2-a9c6-555d59c3535e">
    <hasSpecification xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:e39098df-ef46-464c-9aef-bcdeee2a88cb"/>
    <implementedBy xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:919e8e12-808c-46ca-81af-d90bdbf5321c"/>
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#Implementation"/>
</rdf:Description>

<rdf:Description rdf:about="urn:uuid:2d81389a-ffe5-479a-a7d6-7a131f553b36">
    <enhancementInContext xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:5c1b9090-af8c-4b8c-b77a-0c8de69ba690"/>
    <hasSpecification xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:e39098df-ef46-464c-9aef-bcdeee2a88cb"/>
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#AmendmentMethod"/>
</rdf:Description>

<rdf:Description rdf:about="urn:uuid:5c1b9090-af8c-4b8c-b77a-0c8de69ba690">
    <hasEnhancement xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:d1c371b0-87af-4fda-bde8-3f22f1abc745"/>
    <hasInformationElement xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:9e1b5503-2ef5-4663-8d42-8ae66355389c"/>
    <hasResourceType xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="rt:SingleRecord"/>
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#ContextualizedEnhancement"/>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Description needed here</rdfs:label>
</rdf:Description>

<rdf:Description rdf:about="urn:uuid:9e1b5503-2ef5-4663-8d42-8ae66355389c">
    <composedOf xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="http://rs.tdwg.org/dwc/terms/taxonRank"/>
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#InformationElement"/>
</rdf:Description>

<rdf:Description rdf:about="urn:uuid:d1c371b0-87af-4fda-bde8-3f22f1abc745">
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#Enhancement"/>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Conformance: standardized</rdfs:label>
</rdf:Description>

I'm seeing a need for a label on the ContextualizedEnhancement, where the text "Description needed here" is.
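
To make the "single record" point concrete, the resource type can be read straight out of the ContextualizedEnhancement fragment above. This is a minimal sketch using Python's standard `xml.etree.ElementTree`; the enclosing `rdf:RDF` element and its namespace declarations are assumptions added here, since the fragment was posted without them, and the function name `resource_types` is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
FFDQ_NS = "http://rs.tdwg.org/ffdq#"

# The ContextualizedEnhancement description from the comment above, wrapped in
# an assumed rdf:RDF root so that it parses as a standalone XML document.
FRAGMENT = """
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
<rdf:Description rdf:about="urn:uuid:5c1b9090-af8c-4b8c-b77a-0c8de69ba690">
    <hasEnhancement xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:d1c371b0-87af-4fda-bde8-3f22f1abc745"/>
    <hasInformationElement xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="urn:uuid:9e1b5503-2ef5-4663-8d42-8ae66355389c"/>
    <hasResourceType xmlns="http://rs.tdwg.org/ffdq#" rdf:resource="rt:SingleRecord"/>
    <rdf:type rdf:resource="http://rs.tdwg.org/ffdq#ContextualizedEnhancement"/>
    <rdfs:label rdf:datatype="http://www.w3.org/2001/XMLSchema#string">Description needed here</rdfs:label>
</rdf:Description>
</rdf:RDF>
"""


def resource_types(rdf_xml):
    """Map each described resource to the value of its ffdq hasResourceType."""
    root = ET.fromstring(rdf_xml)
    found = {}
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        about = description.get(f"{{{RDF_NS}}}about")
        for element in description.findall(f"{{{FFDQ_NS}}}hasResourceType"):
            found[about] = element.get(f"{{{RDF_NS}}}resource")
    return found


print(resource_types(FRAGMENT))
```

The point of the sketch is only that "single record" is carried by the ContextualizedEnhancement node rather than by the Specification text.
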

Tasilee commented 2 years ago

Why not "Description needed here" -> "Single record"? :)

Tasilee commented 2 years ago

Looking to conform the Expected Response of AMENDMENTs to "AMEND {target} if ...", this one could become 'trivial':

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMEND dwc:taxonRank using bdq:sourceAuthority; otherwise NOT_AMENDED

We do have the precursor #161 and #162 but we also agreed that the 'tests' should be stand-alone. Maybe the ER should be something like

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMEND the value of dwc:taxonRank if this value can be matched to a standard value in bdq:sourceAuthority; otherwise NOT_AMENDED

?

ArthurChapman commented 2 years ago

Note that we have used AMENDED everywhere, not AMEND, and it fits with NOT_AMENDED.

Agree to your change - I checked all the Descriptions so that we said "value of"

Tasilee commented 1 year ago

Done

chicoreus commented 1 year ago

Proposal from discussion on 2022-12-11: The concern is what to do when the value is "especie", or another synonym or language variant in the specified vocabulary. The expectation of amending to the standard term would be that this value would be amended to "species", the standard term in the vocabulary. This could be undesirable for some users who wish to use a particular language variant for their data. Thus, similar to our handling of different national standards for geodeticDatum, the proposal would be for a parameter for this, and for other amendments that conform data to a vocabulary, where the parameter would allow one of three cases: (1) the amendment proposes the standard term in the vocabulary, using language variants and synonyms in the vocabulary as values to be amended to the standard form; (2) the amendment proposes a particular (specified) language variant of the vocabulary term, e.g. converting all values in taxonRank to a Spanish form; or (3) treating any synonym or language variant in the vocabulary as valid, and only attempting (if possible) to conform values to the vocabulary if they don't occur anywhere within it (e.g. conforming case, SPECIES to species, and ESPECIE to especie).
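
The three cases in this proposal can be sketched as one function with a mode parameter. Everything named here is hypothetical: the `CONCEPTS` structure, the mode names "standard", "language" and "any", and the sample labels are invented for illustration and are not the structure of any actual vocabulary.

```python
# Hypothetical vocabulary: each concept has a standard term, per-language labels,
# and a list of synonyms. The entries are illustrative only.
CONCEPTS = [
    {"standard": "species", "labels": {"en": "species", "es": "especie"}, "synonyms": ["sp.", "sp"]},
    {"standard": "genus", "labels": {"en": "genus", "es": "género"}, "synonyms": ["gen."]},
]


def propose(value, mode="standard", language=None):
    """Return the proposed replacement, or None when nothing should change."""
    if mode not in ("standard", "language", "any"):
        raise ValueError(f"unknown mode: {mode}")
    key = value.strip().casefold()
    for concept in CONCEPTS:
        known = [concept["standard"], *concept["labels"].values(), *concept["synonyms"]]
        match = next((term for term in known if term.casefold() == key), None)
        if match is None:
            continue
        if mode == "standard":
            # Case 1: always propose the standard term.
            target = concept["standard"]
        elif mode == "language":
            # Case 2: propose the label in the requested language, if there is one.
            target = concept["labels"].get(language)
        else:
            # Case 3: any known term is valid; only conform case and whitespace.
            target = match
        return target if target is not None and target != value else None
    return None


print(propose("especie"))
print(propose("sp.", mode="language", language="es"))
print(propose("ESPECIE", mode="any"))
```

In this sketch "especie" is left alone under the third mode but amended to "species" under the first, which is the behavioural difference the proposal is concerned with.
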

chicoreus commented 1 year ago

Per the note by @timrobertson100 in #170, we should probably switch the Source Authority to the more current, complete, and comprehensive one at https://registry.gbif.org/vocabulary/TaxonRank/concepts (available in machine-readable form at https://api.gbif.org/v1/vocabularies/TaxonRank/concepts). Likewise, we should make a similar change in #162.
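
Because the machine-readable endpoint is a remote service, an implementation needs some way to decide that the source authority "is not available". The following is a minimal sketch of that one step using only the Python standard library. It makes no assumption about the shape of the returned JSON; the function name `load_source_authority` and the injectable `fetch` argument are hypothetical conveniences, not part of any specification here.

```python
import json
from urllib.request import urlopen

DEFAULT_URL = "https://api.gbif.org/v1/vocabularies/TaxonRank/concepts"


def load_source_authority(url=DEFAULT_URL, fetch=None, timeout=10):
    """Return the decoded JSON payload, or None if the authority is unavailable."""
    if fetch is None:
        # Default fetcher: a plain HTTP GET that returns the raw response body.
        def fetch(target):
            with urlopen(target, timeout=timeout) as response:
                return response.read()
    try:
        return json.loads(fetch(url))
    except (OSError, ValueError):
        # OSError covers network failures and timeouts (URLError is a subclass);
        # ValueError covers a body that is not valid JSON.
        return None
```

A caller would treat a `None` return as the EXTERNAL_PREREQUISITES_NOT_MET case; how the payload is then turned into a lookup table depends on the response format, which is deliberately left out of this sketch.
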

ArthurChapman commented 1 year ago

Updated "Source Authority" and "References" in accord with @chicoreus's comment above. @Tasilee to check.

Tasilee commented 1 year ago

Thanks @ArthurChapman: Checked.

Tasilee commented 1 year ago

Could I suggest we replace the Expected response

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMENDED the value of dwc:taxonRank if this value can be matched to a standard value in bdq:sourceAuthority; otherwise NOT_AMENDED

with

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMENDED the value of dwc:taxonRank if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED

?

Tasilee commented 1 year ago

Is it odd that the Expected Response doesn't contain an INTERNAL_PREREQUISITES_NOT_MET? Should we have something like

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonRank is EMPTY; AMENDED the value of dwc:taxonRank if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED

?

ArthurChapman commented 1 year ago

That would make sense to me for consistency, but without it - it still works.

Tasilee commented 1 year ago

The determining factor is whether it is more appropriate to return INTERNAL_PREREQUISITES_NOT_MET or NOT_AMENDED if dwc:taxonRank is EMPTY.

tucotuco commented 1 year ago

I would prefer returning INTERNAL_PREREQUISITES_NOT_MET. It tells you better where the test failed, not just that it did.

Tasilee commented 1 year ago

OK, then I have edited the Expected Response

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; AMENDED the value of dwc:taxonRank if this value can be unambiguously matched to a value in bdq:sourceAuthority; otherwise NOT_AMENDED

to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:taxonRank is EMPTY; AMENDED the value of dwc:taxonRank if it can be unambiguously matched to a term in bdq:sourceAuthority; otherwise NOT_AMENDED

Tasilee commented 1 year ago

Restructured Parameter(s) and Source authority

Tasilee commented 1 year ago

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "GBIF Vocabulary: Taxonomic Rank" [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]

to

bdq:sourceAuthority default = "Darwin Core" {https://dwc.tdwg.org} {dwc:taxonRank [https://dwc.tdwg.org/list/#dwc_taxonRank]} {GBIF Vocabulary: Taxonomic Rank [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]}

chicoreus commented 1 year ago

On Mon, 10 Jul 2023 18:21:34 -0700 Lee Belbin wrote:

> Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:
>
> bdq:sourceAuthority default = "GBIF Vocabulary: Taxonomic Rank" [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]

This is probably a case where we do want to assert that the GBIF Vocabulary is the source authority, as it provides a controlled vocabulary, while Darwin Core does not.

Tasilee commented 1 year ago

From @chicoreus's comment (https://github.com/tdwg/bdq/issues/162#issuecomment-1629955600), changed Source Authority from

bdq:sourceAuthority default = "Darwin Core" {https://dwc.tdwg.org} {dwc:taxonRank [https://dwc.tdwg.org/list/#dwc_taxonRank]} {GBIF Vocabulary: Taxonomic Rank [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]}

to

bdq:sourceAuthority default = "GBIF Vocabulary: Taxonomic Rank" {[https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]} {dwc:taxonRank [https://dwc.tdwg.org/list/#dwc_taxonRank]}

Tasilee commented 10 months ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

Tasilee commented 3 months ago

Changed Source Authority from

bdq:sourceAuthority default = "GBIF Vocabulary: Taxonomic Rank" {[https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]} {dwc:taxonRank [https://dwc.tdwg.org/list/#dwc_taxonRank]}

to

bdq:sourceAuthority default = "GBIF TaxonRank Vocabulary" {[https://api.gbif.org/v1/vocabularies/TaxonRank]} {"dwc:taxonRank vocabulary API" [https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]}
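
The Source Authority strings in this thread share a loose pattern: a quoted default name followed by URLs in square brackets. As a rough sketch, and assuming only that pattern (the brace grouping has varied between revisions, so it is ignored), the name and URLs could be pulled out with two regular expressions. The function name `parse_source_authority` is hypothetical.

```python
import re


def parse_source_authority(text):
    """Extract the quoted default name and every bracketed URL from the string."""
    name = re.search(r'default\s*=\s*"([^"]+)"', text)
    urls = re.findall(r'\[(https?://[^\]\s]+)\]', text)
    return {"default": name.group(1) if name else None, "urls": urls}


# One of the earlier revisions quoted in this thread.
example = (
    'bdq:sourceAuthority default = "GBIF Vocabulary: Taxonomic Rank" '
    '{[https://api.gbif.org/v1/vocabularies/TaxonRank/concepts]} '
    '{dwc:taxonRank [https://dwc.tdwg.org/list/#dwc_taxonRank]}'
)
print(parse_source_authority(example))
```

Ignoring the braces keeps the sketch tolerant of the syntax changes recorded above, at the cost of not recovering which label belongs to which URL.
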