tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_COUNTRY_FOUND #21

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 69b2efdc-6269-45a4-aecb-4cb99c2ae134 |
| Label | VALIDATION_COUNTRY_FOUND |
| Description | Does the value of dwc:country occur in bdq:sourceAuthority? |
| TestType | Validation |
| Darwin Core Class | Location |
| Information Elements ActedUpon | dwc:country |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:country is EMPTY; COMPLIANT if value of dwc:country is a place type equivalent to "nation" by the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | COUNTRY_FOUND |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]} |
| Specification Last Updated | 2024-04-15 |
| Examples | [dwc:country="Eswatini": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country is a valid country name according to The Getty Thesaurus of Geographic Names (2021-03-30)."] |
| | [dwc:country="Swaziland": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="Eswatini is the preferred name according to The Getty Thesaurus of Geographic Names (2021-03-30)."] |
| Source | ALA, GBIF |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | Non-country information such as "high seas" will fail this test. Multiple values in the dwc:country field (whether to signify on a border or in a list of possibilities) will fail this test. Locations outside of a jurisdiction covered by a country code should not have a value in the field dwc:countryCode. This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters. |

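
The Expected Response above can be read as a short decision procedure. The following is a minimal sketch of that reading, not a reference implementation. The `Response` class, the `toy_is_nation` lookup, and the `TOY_NATIONS` set are hypothetical stand-ins invented for illustration; they are not real TGN data or a real TGN client. The sketch also assumes that EMPTY means missing or whitespace-only, and that an unavailable source authority surfaces as a `ConnectionError`.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Response:
    status: str
    result: Optional[str]
    comment: str


# Toy stand-in for bdq:sourceAuthority; NOT real TGN data or a real TGN client.
TOY_NATIONS = {"Eswatini"}


def toy_is_nation(name: str) -> bool:
    """Hypothetical lookup: exact, case-sensitive match against the toy set."""
    return name in TOY_NATIONS


def validation_country_found(
    country: Optional[str], is_nation: Callable[[str], bool]
) -> Response:
    # Assumption: EMPTY means the value is missing or whitespace-only.
    if country is None or country.strip() == "":
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None,
                        "dwc:country is EMPTY")
    try:
        found = is_nation(country)
    except ConnectionError:
        # Assumption: the lookup raises ConnectionError when unavailable.
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None,
                        "bdq:sourceAuthority is not available")
    if found:
        return Response("RUN_HAS_RESULT", "COMPLIANT",
                        "dwc:country found as a nation in the source authority")
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                    "dwc:country not found as a nation in the source authority")
```

With the toy lookup, `validation_country_found("Eswatini", toy_is_nation).result` is `"COMPLIANT"` and the same call with `"Swaziland"` gives `"NOT_COMPLIANT"`, mirroring the two examples in the table. Because the toy lookup matches exactly, a value such as `" Eswatini"` also comes back NOT_COMPLIANT.
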
iDigBioBot commented 6 years ago

Comment by Arthur Chapman (@ArthurChapman) migrated from spreadsheet: Is this something that should be added to NOTES column?

iDigBioBot commented 6 years ago

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: In cases where there is no country, this test is only useful AFTER the country has been interpreted from the coordinates.

iDigBioBot commented 6 years ago

Comment by Paul Morris (@chicoreus) migrated from spreadsheet: Treat "High Seas" as a valid country value. Country of origin, or high seas is critical information for Nagoya Protocol implementation.

ArthurChapman commented 6 years ago

Discussion from Gainesville meeting: Should we be using the current country, or the country at the time of collection? For many reasons, it was agreed that for this test the country should be the current country.

On the sequence (see Paula's comment above): run validations, run amendments, then run validations again (Paul). This will handle a number of the cases where sequence is a concern.
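
The validate, amend, validate sequence can be sketched as follows. This is an illustration under stated assumptions, not the group's implementation: `validate_country`, `amend_country_from_coordinates`, `run_sequence`, and both toy lookup tables are hypothetical names and data invented for this example, and the validation is reduced to a bare result string.

```python
from typing import Dict, Optional, Tuple

Record = Dict[str, Optional[str]]

# Toy data for illustration only; not taken from any real authority or gazetteer.
TOY_NATIONS = {"Eswatini"}
TOY_COUNTRY_BY_COORDINATES = {("-26.5", "31.5"): "Eswatini"}


def validate_country(record: Record) -> str:
    """Simplified validation: returns only a result/status string."""
    country = record.get("dwc:country")
    if country is None or country.strip() == "":
        return "INTERNAL_PREREQUISITES_NOT_MET"
    return "COMPLIANT" if country in TOY_NATIONS else "NOT_COMPLIANT"


def amend_country_from_coordinates(record: Record) -> Record:
    """Hypothetical amendment: fill an empty country from the coordinates."""
    amended = dict(record)  # work on a copy so the original record is untouched
    country = amended.get("dwc:country")
    if country is None or country.strip() == "":
        key = (amended.get("dwc:decimalLatitude"),
               amended.get("dwc:decimalLongitude"))
        if key in TOY_COUNTRY_BY_COORDINATES:
            amended["dwc:country"] = TOY_COUNTRY_BY_COORDINATES[key]
    return amended


def run_sequence(record: Record) -> Tuple[str, str, Record]:
    before = validate_country(record)                # first validation pass
    amended = amend_country_from_coordinates(record)  # amendment pass
    after = validate_country(amended)                # second validation pass
    return before, after, amended
```

For a record with an empty dwc:country and the toy coordinates, the first pass reports INTERNAL_PREREQUISITES_NOT_MET and the second pass reports COMPLIANT, which is the case Paula's comment describes.
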

John: do we require the country to be a written-out version of a country code, or can it be another political entity, for example United Kingdom? This is an important test. A failure is not necessarily an error, but a warning that the country is not a standard modern country. French Indo-China could mean any one of a number of current countries.

It was agreed to apply the test to current ISO countries. The question was raised of why ISO and not the Getty Thesaurus. GeoNames was another suggestion. The Getty Thesaurus has hierarchies.

For now the test could simply be "in a vocabulary", and which vocabulary is used could change over time. DwC mentions Getty as a recommended vocabulary, so we could go that way to be consistent with DwC: point the ISO test at ISO, and the human-readable "Country" at the richer Getty Thesaurus.

tucotuco commented 2 years ago

I suggest the Description:

'Does the value of dwc:country occur as the equivalent of a nation in the bdq:sourceAuthority?'

in place of:

'Does the value of dwc:country occur in bdq:sourceAuthority?'

ArthurChapman commented 2 years ago

That fits with the equivalent NAME tests

Tasilee commented 1 year ago

From the zoom meeting today, we agreed to align this test with the taxonomic counterparts by renaming "STANDARD" to "FOUND".

Tasilee commented 1 year ago

Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."

Tasilee commented 1 year ago

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" [https://www.getty.edu/research/tools/vocabularies/tgn/index.html]

to

bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}

Tasilee commented 10 months ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted"

chicoreus commented 4 months ago

Updated notes from "fail" to more specific: "This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters. "
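
An explicit check for the whitespace rule might look like the sketch below. The helper name `has_leading_or_trailing_padding` is hypothetical, and "non-printing" is taken here to mean whatever Python's `str.isprintable()` rejects; the specification does not define the term, so that mapping is an assumption.

```python
def has_leading_or_trailing_padding(value: str) -> bool:
    """True if the first or last character is whitespace or non-printing.

    Assumption: "non-printing" follows Python's str.isprintable() definition.
    """
    if value == "":
        return False  # an empty value is handled by the EMPTY prerequisite

    def is_padding(ch: str) -> bool:
        return ch.isspace() or not ch.isprintable()

    return is_padding(value[0]) or is_padding(value[-1])
```

A validation could call this before consulting the source authority and return NOT_COMPLIANT when it is true, so that the result does not depend on how strictly the authority's own lookup matches names. Interior spaces, as in "Costa Rica", are not flagged.
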

Tasilee commented 2 months ago

Changed "was" to "is" to align with standard phrasing in ER as in "INTERNAL_PREREQUISITES_NOT_MET if dwc:country is EMPTY"