tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-AMENDMENT_COUNTRYCODE_STANDARDIZED #48

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | fec5ffe6-3958-4312-82d9-ebcca0efb350 |
| Label | AMENDMENT_COUNTRYCODE_STANDARDIZED |
| Description | Proposes an amendment to the value of dwc:countryCode if it can be interpreted as an ISO country code. |
| TestType | Amendment |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:countryCode |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the value of dwc:countryCode is bdq:Empty; AMENDED the value of dwc:countryCode if it can be unambiguously interpreted from bdq:sourceAuthority; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | COUNTRYCODE_STANDARDIZED |
| Parameter(s) | |
| Source Authority | bdq:sourceAuthority default = "ISO 3166 Country Codes" {[https://www.iso.org/iso-3166-country-codes.html]} {ISO 3166-1-alpha-2 Country Code search [https://www.iso.org/obp/ui/#search]} |
| Specification Last Updated | 2023-09-17 |
| Examples | [dwc:countryCode="Australia": Response.status=AMENDED, Response.result=dwc:countryCode="AU", Response.comment="dwc:countryCode contains an interpretable value"]<br>[dwc:countryCode="Aust.": Response.status=NOT_AMENDED, Response.result=, Response.comment="dwc:countryCode contains an ambiguous value"] |
| Source | |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | |

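The order of checks in the Expected Response can be sketched in Python as follows. This is only an illustration, not the reference implementation: the `amend_country_code` function, the `Response` tuple, and the two-entry `LOOKUP` dictionary are hypothetical, and a real implementation would consult whatever source authority it is configured with instead of a hard-coded mapping.

```python
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: str
    result: Optional[dict]
    comment: str


# Hypothetical stand-in for the source authority:
# normalized input value -> ISO 3166-1-alpha-2 code.
LOOKUP = {"australia": "AU", "austria": "AT"}


def amend_country_code(value, lookup=LOOKUP):
    """Apply the Expected Response checks in the order the specification lists them."""
    if lookup is None:
        # The source authority could not be reached or loaded.
        return Response("EXTERNAL_PREREQUISITES_NOT_MET", None,
                        "source authority is not available")
    if value is None or value.strip() == "":
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None,
                        "dwc:countryCode is empty")
    code = lookup.get(value.strip().casefold())
    if code is None or code == value:
        # Either no unambiguous interpretation exists or nothing needs to change.
        return Response("NOT_AMENDED", None,
                        "dwc:countryCode was not changed")
    return Response("AMENDED", {"dwc:countryCode": code},
                    "dwc:countryCode contains an interpretable value")
```

With this toy lookup, `amend_country_code("Australia")` proposes `AU`, while `amend_country_code("Aust.")` returns NOT_AMENDED because the abbreviation has no entry.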
iDigBioBot commented 6 years ago

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Added post scoring for consistency

chicoreus commented 5 years ago

Output Type is missing from the Field/Value table. @Tasilee please correct (looks like it should be Output Type = Amendment) and review table for any other errors.

Tasilee commented 5 years ago

Thanks @chicoreus. Done.

ArthurChapman commented 5 years ago

Reference and Parameters updated in accordance with #20. Removed Parameterized.

tucotuco commented 5 years ago

Wait! No!

None of those references allows you to make an amendment; they only provide the controlled vocabulary. I have been working on a general response in issue #178 all day, redoing it several times to try to be succinct. It will be forthcoming momentarily.

Tasilee commented 5 years ago

@tucotuco: Given your comments on #178, I am presuming you are implying that we cannot AMEND without a thesaurus as in the example lookup "AU" -> "036" ?

In this case, we sort of do have something akin to a thesaurus in the References where we are implying that we code in the test something like "find AU among terms {...} and extract best matching ISO 3166-1-alpha-2 country code" ?

tucotuco commented 5 years ago

@Tasilee I don't think the pattern matching method in implementations is a good idea for reasons stated elsewhere. I still think a maintained lookup thesaurus is the way to go.
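To make the contrast concrete, here is a rough Python sketch of a thesaurus lookup next to naive prefix matching. The CSV layout, the four entries, and the function names are all assumptions made for illustration; no maintained thesaurus file is specified in this issue.

```python
import csv
import io

# Illustrative thesaurus content; a maintained file would be far larger.
THESAURUS_CSV = """variant,code
Australia,AU
Commonwealth of Australia,AU
Austria,AT
Republic of Austria,AT
"""


def load_thesaurus(text):
    """Map each normalized variant to the set of codes it is listed under."""
    table = {}
    for row in csv.DictReader(io.StringIO(text)):
        key = row["variant"].strip().casefold()
        table.setdefault(key, set()).add(row["code"].strip())
    return table


def thesaurus_lookup(value, table):
    """Return a code only for an exact entry that maps to a single code."""
    codes = table.get(value.strip().casefold(), set())
    return next(iter(codes)) if len(codes) == 1 else None


def prefix_matches(value, table):
    """Naive pattern matching: every code with a variant starting with the value."""
    stem = value.strip().rstrip(".").casefold()
    return {code
            for variant, codes in table.items() if variant.startswith(stem)
            for code in codes}
```

Here `thesaurus_lookup("Aust.", table)` finds nothing because no curator has listed that abbreviation, whereas `prefix_matches("Aust.", table)` returns both `AU` and `AT`, which is the kind of ambiguity that pattern matching has to resolve by guessing.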

Tasilee commented 5 years ago

@tucotuco - given your comments on #178 - I agree. Thanks John. I guess the process of converting me to the need for thesauri is something we can use in the paper and education generally.

Tasilee commented 1 year ago

From recent discussions, I have changed the Expected Response from

EXTERNAL_PREREQUISITES_NOT_MET if the ISO 3166 service is not available; INTERNAL_PREREQUISITES_NOT_MET if the value of dwc:countryCode is EMPTY; AMENDED the value of dwc:countryCode if it can be unambiguously interpreted from bdq:sourceAuthority; otherwise NOT_AMENDED

to

EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the value of dwc:countryCode is EMPTY; AMENDED the value of dwc:countryCode if it can be unambiguously interpreted from bdq:sourceAuthority; otherwise NOT_AMENDED

chicoreus commented 1 year ago

This one shouldn't be parameterized. The source authority is also entangling the authority (per the Note on dwc:countryCode "Recommended best practice is to use an ISO 3166-1-alpha-2 country code.", ISO 3166-1-alpha-2) with places where the (non-free) authority may be available or consulted. Also, the endpoint specified, https://restcountries.eu/#api-endpoints-list-of-codes, appears to have been unresponsive for some time.

Consider amending (with no parameter, and no source authority) to:

EXTERNAL_PREREQUISITES_NOT_MET if a configured service for checking ISO 3166 country codes is not available; INTERNAL_PREREQUISITES_NOT_MET if the value of dwc:countryCode is EMPTY; AMENDED the value of dwc:countryCode if it can be unambiguously interpreted as an ISO 3166-1-alpha-2 country code; otherwise NOT_AMENDED

In some places we take the position that for data to have quality for CORE purposes, it must conform with the recommendation in the non-normative Note/Comment of a Darwin Core term. This is a case where we should do that.

We also make no recommendations about the scope of values that an implementation should consider in assessing "unambiguously interpreted". These could include case-insensitive matches, matches on ISO three-letter codes, matches on ISO numeric codes, matches on country names, and matches on other codes, including conflicting matches that impart ambiguity.
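One way an implementation might combine these kinds of match is to collect every candidate code and amend only when exactly one remains. The Python sketch below assumes that approach. The `ISO_ROWS` excerpt should be verified against the ISO listing before any real use, and the `ALIASES` entries and function names are hypothetical.

```python
# Small illustrative excerpt: (alpha-2, alpha-3, numeric, English name).
ISO_ROWS = [
    ("AU", "AUS", "036", "Australia"),
    ("AT", "AUT", "040", "Austria"),
    ("KR", "KOR", "410", "Korea (the Republic of)"),
    ("KP", "PRK", "408", "Korea (the Democratic People's Republic of)"),
]

# Hypothetical alias entries; "Korea" on its own deliberately points at two codes.
ALIASES = [
    ("South Korea", "KR"),
    ("North Korea", "KP"),
    ("Korea", "KR"),
    ("Korea", "KP"),
]


def candidate_codes(value):
    """Collect every alpha-2 code that the value matches under any rule."""
    key = value.strip().casefold()
    found = set()
    for alpha2, alpha3, numeric, name in ISO_ROWS:
        # Case-insensitive match on two-letter code, three-letter code,
        # numeric code, or name.
        if key in (alpha2.casefold(), alpha3.casefold(), numeric, name.casefold()):
            found.add(alpha2)
    for alias, alpha2 in ALIASES:
        if key == alias.casefold():
            found.add(alpha2)
    return found


def interpret(value):
    """Return a code only when exactly one candidate was found."""
    codes = candidate_codes(value)
    return codes.pop() if len(codes) == 1 else None
```

Under these assumptions `interpret("aus")` and `interpret("036")` both give `AU`, while `interpret("Korea")` gives nothing because the two alias entries conflict.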

chicoreus commented 1 year ago

Another interpretation of the parameter is that it implies the existence of one or more services that can take a string and return a two-letter ISO code that can be unambiguously interpreted for that string. Such a service doesn't exist.

ArthurChapman commented 1 year ago

Agree with a lot of what you say @chicoreus. We removed Parameterized back in September 2019. The citing of bdq:sourceAuthority does not mean it is Parameterized; there are two tests like that, and the other is #20. If we change the wording here, we should do likewise in #20. I am happy to go with your wording above and remove the reference to bdq:sourceAuthority.

Tasilee commented 1 year ago

I agree, but if the Expected response is self-contained, then there is no need for a bdq:sourceAuthority?

ArthurChapman commented 1 year ago

As suggested in #20 - if agreed - can remove the reference to bdq:sourceAuthority in Source Authority - and it shouldn't have anything in Parameter(s) anyway - as it is not Parameterized.

Tasilee commented 1 year ago

OK, go for it, but we need to capture "https://restcountries.eu/#api-endpoints-list-of-codes, https://www.iso.org/obp/ui/#search" in References, and given what @chicoreus said this morning, we should use a standard phrase to alert implementers...

API endpoint [https://restcountries.eu/#api-endpoints-list-of-codes, https://www.iso.org/obp/ui/#search]

?

Tasilee commented 1 year ago

Removed "bdq:sourceAuthority" from Parameter(s) as there is only one: ISO 3166.

Tasilee commented 1 year ago

Amended Source Authority value from

"bdq:sourceAuthority is "ISO 3166-1-alpha-2" [https://restcountries.eu/#api-endpoints-list-of-codes, https://www.iso.org/obp/ui/#search]"

to

"{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}"

to align with suggested syntax and better link.

Tasilee commented 1 year ago

Amended Source Authority to align with @chicoreus suggested syntax

{bdq:sourceAuthority = ISO 3166-1-alpha-2} { Country codes [https://www.iso.org/obp/ui/#search]}

to

bdq:sourceAuthority default = "ISO 3166-1-alpha-2 country codes" { [https://www.iso.org/obp/ui/#search]}

Tasilee commented 1 year ago

Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:

bdq:sourceAuthority default = "ISO 3166-1-alpha-2 country codes" {[https://www.iso.org/obp/ui/#search]}

to

bdq:sourceAuthority default = "ISO 3166 Country Codes" {[https://www.iso.org/iso-3166-country-codes.html]} {ISO 3166-1-alpha-2 Country Code search [https://www.iso.org/obp/ui/#search]}

Tasilee commented 12 months ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Field" to "TestField" and "Output Type" to "TestType".