Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_COUNTRYCODE_STANDARD #20

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 0493bcfb-652e-4d17-815b-b0cce0742fbe |
| Label | VALIDATION_COUNTRYCODE_STANDARD |
| Description | Is the value of dwc:countryCode a valid ISO 3166-1-alpha-2 country code? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:countryCode |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode is bdq:Empty; COMPLIANT if dwc:countryCode can be unambiguously interpreted as a valid ISO 3166-1-alpha-2 country code; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | COUNTRYCODE_STANDARD |
| Parameter(s) | |
| Source Authority | bdq:sourceAuthority default = "ISO 3166 Country Codes" {[https://www.iso.org/iso-3166-country-codes.html]} {ISO 3166-1-alpha-2 Country Code search [https://www.iso.org/obp/ui/#search]} |
| Specification Last Updated | 2024-04-15 |
| Examples | [dwc:countryCode="GL": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:countryCode is a valid ISO (ISO 3166-1-alpha-2 country codes) value"] [dwc:countryCode="GRL": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:countryCode is NOT a valid ISO (ISO 3166-1-alpha-2 country codes) value"] |
| Source | TG2 |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | Locations outside of a jurisdiction covered by a country code may have a value in the field dwc:countryCode, the ISO user defined codes include XZ used by the UN for installations on the high seas and suitable for a marker for the high seas. Also available in the ISO user defined codes is ZZ, used by GBIF to mark unknown countries. This test should accept both XZ and ZZ as COMPLIANT country codes. This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters. |

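
The Expected Response and Notes above can be read as a small decision procedure. The following is a minimal sketch of that reading, not the reference implementation. The names `Response`, `validation_countrycode_standard`, `ISO_ALPHA2_SUBSET` and `DEFAULT_CODES` are hypothetical, the code set is a tiny illustrative subset rather than the full ISO list, and two behaviours are assumptions: matching is exact and case-sensitive, and a value of `None` or only whitespace is treated as bdq:Empty.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: a tiny illustrative subset; a real implementation would hold the
# full ISO 3166-1-alpha-2 list obtained from the source authority.
ISO_ALPHA2_SUBSET = frozenset({"AU", "BR", "GL", "US"})
# User-assigned codes that the Notes say this test should accept.
ACCEPTED_USER_ASSIGNED = frozenset({"XZ", "ZZ"})
DEFAULT_CODES = ISO_ALPHA2_SUBSET | ACCEPTED_USER_ASSIGNED


@dataclass(frozen=True)
class Response:
    status: str
    result: Optional[str]
    comment: str


def validation_countrycode_standard(country_code, valid_codes=DEFAULT_CODES):
    """Apply the Expected Response clauses in the order they are written."""
    # valid_codes=None models an unavailable bdq:sourceAuthority.
    if valid_codes is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None,
                        "bdq:sourceAuthority is not available")
    # Assumption: None and whitespace-only values count as bdq:Empty.
    if country_code is None or country_code.strip() == "":
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None,
                        "dwc:countryCode is bdq:Empty")
    # Exact membership: " GL", "GL\t" and "GRL" all fall through to
    # NOT_COMPLIANT because no stripping or case folding is applied.
    if country_code in valid_codes:
        return Response("RUN_HAS_RESULT", "COMPLIANT",
                        "dwc:countryCode is a valid ISO "
                        "(ISO 3166-1-alpha-2 country codes) value")
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                    "dwc:countryCode is NOT a valid ISO "
                    "(ISO 3166-1-alpha-2 country codes) value")
```

With these assumptions, `validation_countrycode_standard("GL")` returns a COMPLIANT response and `validation_countrycode_standard("GRL")` returns NOT_COMPLIANT, matching the two Examples in the table.
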
iDigBioBot commented 6 years ago

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Added post scoring for consistency

ArthurChapman commented 6 years ago

Question from Gainesville meeting: what about records outside of country jurisdiction (e.g. high seas)? If material is from the high seas, the country code should be empty. Description altered to: If the dwc:countryCode contains a value but that value is not a valid ISO Code. Pass description is EMPTY if has a valid ISO Code.

We need a standard formalisation of what we mean by "EMPTY" which includes Empty, NULL, /N etc. Use EMPTY in all caps
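
One way to picture such a formalisation is a single helper that every test calls. This is only a sketch of the idea raised in this comment: the function name `is_empty` and the set `EMPTY_MARKERS` are hypothetical, the marker list contains just the two examples named above, and the definition of EMPTY eventually adopted by the group may well differ.

```python
# Assumption: literal markers treated as EMPTY. The comment above names only
# "NULL" and "/N" as examples and does not fix a final list.
EMPTY_MARKERS = frozenset({"NULL", "/N"})


def is_empty(value):
    """Return True if value should be treated as EMPTY under this sketch."""
    if value is None:
        return True
    stripped = value.strip()  # drop leading and trailing whitespace
    # Case-insensitive comparison of markers is also an assumption.
    return stripped == "" or stripped.upper() in EMPTY_MARKERS
```

Centralising the check keeps the meaning of EMPTY identical across tests, so a later change to the definition touches one place.
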

chicoreus commented 5 years ago

This test should not be parameterized. Different users will not wish to apply different ISO country codes. A parameterized resource that provides a country code lookup service is one way of implementing this test (which we could recommend, though I suggest we merely mention it). However, most implementors will almost certainly want to work with a local data structure holding the short list of valid country codes rather than making a request to a service for each data record. This parameter (1) strays too far into forcing a particular form of implementation on implementors, and (2) requires a parameterized version of the test when none is needed, as all use cases refer to the same ISO country code list.

ArthurChapman commented 5 years ago

I have no idea why this is parameterized, particularly when the description says "dwc:countryCode is a valid ISO (ISO 3166-1-alpha-2 country codes)". @tucotuco - you parameterized this - what was the thinking behind that?

tucotuco commented 5 years ago

I concur that this test should not be parametrized.

ArthurChapman commented 5 years ago

Removed parameterization, added new references. (Is the one I have listed as bdq:sourceAuthority the best one here?)

tucotuco commented 5 years ago

To me, the best source is https://restcountries.eu/#api-endpoints-list-of-codes, simply because they have an API. I don't think it will matter too much. This is a straightforward controlled vocabulary.

Tasilee commented 5 years ago

@tucotuco: Agreed.

chicoreus commented 2 years ago

The assertion about external prerequisites not met in the specification is probably confusing in the absence of a defined source authority service.

We should probably note that implementers may use a service to obtain country codes (best implementation is probably for code to download and cache a list, e.g. the json copy from https://datahub.io/core/country-list) rather than making a service call for each run of the test. With a cached copy of country codes, implementors could have the choice of returning EXTERNAL_PREREQUISITES_NOT_MET if unable to refresh the cached copy on start up, or just using the cached copy, with how old the cache may be at a maximum being up to them. This could go in the notes rather than the specification (expected response).
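
The download-and-cache approach described above could look roughly like the following. It is a sketch under stated assumptions, not a recommended implementation: `load_country_codes` and its parameters are hypothetical names, `fetch` is a caller-supplied function that returns the downloaded list (so no particular HTTP library or URL is baked in), the records are assumed to be dictionaries with a `"Code"` key as in the datahub country-list JSON, and the 30-day maximum age is an arbitrary placeholder.

```python
import json
import time
from pathlib import Path


def load_country_codes(cache_path, fetch, max_age_seconds=30 * 24 * 3600,
                       now=time.time):
    """Return a frozenset of country codes, or None if no source is usable.

    fetch: hypothetical zero-argument callable returning a list of dicts,
    each with a "Code" key (assumed record layout).
    """
    cache_path = Path(cache_path)
    cache_is_fresh = (
        cache_path.exists()
        and now() - cache_path.stat().st_mtime <= max_age_seconds
    )
    if not cache_is_fresh:
        try:
            records = fetch()
            cache_path.write_text(json.dumps(records), encoding="utf-8")
        except Exception:
            # Refresh failed (broad catch is deliberate in this sketch):
            # fall through and use a stale cached copy if one exists.
            pass
    if not cache_path.exists():
        # Nothing cached and nothing fetched: the caller would report
        # EXTERNAL_PREREQUISITES_NOT_MET.
        return None
    records = json.loads(cache_path.read_text(encoding="utf-8"))
    return frozenset(record["Code"] for record in records)
```

This version takes the second option mentioned above: when a refresh fails it keeps using the cached copy, however old. An implementer who prefers the first option would return `None` in the `except` branch instead.
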

Also noting that https://restcountries.eu/#api-endpoints-list-of-codes has been unresponsive of late.

Tasilee commented 2 years ago

Thanks @chicoreus. Local caching is likely to apply to a number of bdq:sourceAuthority values. Why don't we just use the standard phrase "EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available", define our default in the Notes and leave the implementation up to the implementer?

tucotuco commented 2 years ago

"not available" seems like it covers the eventualities well. I like the solution.

ArthurChapman commented 2 years ago

I agree @tucotuco

Tasilee commented 2 years ago

Edited accordingly.

tucotuco commented 2 years ago

I suggest the Description:

'Is the value of dwc:countryCode a valid ISO 3166-1-alpha-2 country code?'

in place of:

'Is the value of dwc:countryCode a valid ISO country code value?'

Tasilee commented 2 years ago

Changed Description

Does the value of dwc:country occur as the equivalent of a nation in the bdq:sourceAuthority?

to

Does the value of dwc:countryCode occur as the equivalent of a nation in the bdq:sourceAuthority?

ArthurChapman commented 2 years ago

That is not the same as the ER

Should be

Is the value of dwc:countryCode a valid ISO 3166-1-alpha-2 country code value?

Tasilee commented 2 years ago

I agree and wonder how this was reverted.

Tasilee commented 2 years ago

Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."

ArthurChapman commented 1 year ago

See comments under #48. If we agree to a change there, then this one should also be changed to

EXTERNAL_PREREQUISITES_NOT_MET if a configured service for checking ISO 3166 country codes is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode was EMPTY; COMPLIANT if it can be unambiguously interpreted as a ISO 3166-1-alpha-2; otherwise NOT_COMPLIANT

and remove reference to bdq:sourceAuthority under Source Authority

Tasilee commented 1 year ago

I agree, and wouldn't this apply also to

#38, #133

bdq:sourceAuthority default = "Creative Commons" [https://creativecommons.org/]

#62, #48

bdq:sourceAuthority is "ISO 3166-1-alpha-2" [https://restcountries.eu/#api-endpoints-list-of-codes, https://www.iso.org/obp/ui/#search]

and possibly a few more?

Tasilee commented 1 year ago

References need checking as restcountries.eu is no longer there. We will need a link check through all of the issues.

ArthurChapman commented 1 year ago

Suggested possibly using https://rapidapi.com/ajayakv/api/rest-countries - awaiting feedback

ArthurChapman commented 1 year ago

Updated References

Tasilee commented 1 year ago

Added Source Authority value as "{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}"

for now.

Tasilee commented 1 year ago

Amended Source Authority to align with @chicoreus suggested syntax

{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}

to

bdq:sourceAuthority default = "ISO 3166-1-alpha-2 country codes" { [https://www.iso.org/obp/ui/#search]}

Tasilee commented 1 year ago

Corrected syntax on Source Authority

Tasilee commented 1 year ago

Changed Expected response from

EXTERNAL_PREREQUISITES_NOT_MET if a configured service for checking ISO 3166 country codes is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode was EMPTY; COMPLIANT if it can be unambiguously interpreted as a ISO 3166-1-alpha-2; otherwise NOT_COMPLIANT

to

EXTERNAL_PREREQUISITES_NOT_MET if a configured service for checking ISO 3166 country codes is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode was EMPTY; COMPLIANT if it can be unambiguously interpreted as an ISO 3166-1-alpha-2 country code; otherwise NOT_COMPLIANT

and removed reference to bdq:sourceAuthority in Source Authority

Tasilee commented 1 year ago

Due to recent discussions, changed

EXTERNAL_PREREQUISITES_NOT_MET if a configured service for checking ISO 3166 country codes is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode was EMPTY; COMPLIANT if it can be unambiguously interpreted as a ISO 3166-1-alpha-2 country code; otherwise NOT_COMPLIANT

to

EXTERNAL_PREREQUISITES_NOT_MET if bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the dwc:countryCode was EMPTY; COMPLIANT if dwc:countryCode can be unambiguously interpreted as an ISO 3166-1-alpha-2 country code; otherwise NOT_COMPLIANT

and Source Authority from blank to

bdq:sourceAuthority default = "ISO 3166 Country Codes" {[https://www.iso.org/iso-3166-country-codes.html]} {ISO 3166-1-alpha-2 Country Code search [https://www.iso.org/obp/ui/#search]}

and updated Specification Last Updated

Tasilee commented 12 months ago

I'm making a start on splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". I am working through the tests in numerical sequence. Please check.

ArthurChapman commented 12 months ago

Does that affect Specification Last Updated?

Tasilee commented 12 months ago

I was wondering about that. Paul's call?

Tasilee commented 11 months ago

Removed TestField "Warning Type" as it highly correlates with "Data Quality Dimension". I will be working through the 99 CORE tests making this change but will not add separate Comments, so as to avoid another round of email notifications.

chicoreus commented 6 months ago

Updated notes to replace "fail" with more explicit: This test must return NOT_COMPLIANT if there is leading or trailing whitespace or there are leading or trailing non-printing characters.

Tasilee commented 5 months ago

Playing with this test as an example for the Example Use Case (replaced 'Source')

Tasilee commented 5 months ago

Recent discussions by the team resulted in recommendations for how we handle Use Cases. This test is an example template for all other tests (CORE, Supplementary, DO NOT IMPLEMENT and Immature/Incomplete) and will not be further commented.

The Use Cases will be documented in the form of Label - Description in the Vocabulary (#152).

Use Case being Informative (non-normative), we will not update 'Specification Last Updated'. Speak now or forever ....

chicoreus commented 5 months ago

Note that the recommendation was that we place the UseCase definitions and relationships of UseCases to Tests in a separate document, not include the application of UseCases to tests in the tests, and revert this test to show the source, rather than replacing source with UseCase.

In the UseCase document, the UseCase label and guid should probably be normative, and the description, citations, and list of included tests should be informative.

Tasilee commented 5 months ago

Changed "was" to "is" to align with standard phrasing in ER as in "INTERNAL_PREREQUISITES_NOT_MET if xxx is EMPTY"

chicoreus commented 3 weeks ago

Discussion of how to mark the high seas in the TG2 working group meeting in Seattle. Options are: "High Seas" in dwc:country, used by three institutions feeding data to GBIF for about 60k occurrences; a three-letter country code BNJ (beyond national jurisdiction) used by OBIS (along with BBNJ for biological material beyond national jurisdiction); or the UN use of the user-assigned code XZ in the ISO country code standard for installations on the high seas, which is in some use for marking origins from the high seas.

dwc:countryCode=XZ, country="" appears to be the best means to mark occurrences as being in the high seas; this is consistent with the Darwin Core specification of ISO country code values.

Tasilee commented 3 weeks ago

Thanks to @davewatts3 for raising the BNJ and BBNJ possibilities, which we have noted. Hopefully we get some solution to this in the future.