Open Tasilee opened 2 years ago
TestField | Value |
---|---|
GUID | d257eb98-27cb-48e5-8d3c-ab9fca4edd11 |
Label | VALIDATION_COUNTRYSTATEPROVINCE_UNAMBIGUOUS |
Description | Is the combination of the values of the terms dwc:country, dwc:stateProvince unique in the bdq:sourceAuthority? |
TestType | Validation |
Darwin Core Class | dcterms:Location |
Information Elements ActedUpon | dwc:country, dwc:stateProvince |
Information Elements Consulted | |
Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are bdq:Empty; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are unambiguously resolved to a single result with a child-parent relationship in the bdq:sourceAuthority and the entity matching the value of dwc:country in the bdq:sourceAuthority is an ISO 3166 country-like administrative entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
Data Quality Dimension | Conformance |
Term-Actions | COUNTRYSTATEPROVINCE_UNAMBIGUOUS |
Parameter(s) | bdq:sourceAuthority |
Source Authority | bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]} |
Specification Last Updated | 2024-09-18 |
Examples | [dwc:country="Argentina", dwc:stateProvince="Rio Negro": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country and dwc:stateProvince are unambiguous"], [dwc:country="", dwc:stateProvince="WA": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:country and dwc:stateProvince are ambiguous. Matches Western Australia, Washington State (US)"] |
Source | VertNet, Kurator |
References | |
Example Implementations (Mechanisms) | Kurator |
Link to Specification Source Code | https://github.com/kurator-org/kurator-validation/blob/master/packages/kurator_dwca/workflows/dwca_geography_assessor.yaml |
Notes | See table https://github.com/tdwg/bdq/issues/95#issuecomment-1226450014. A fail condition may arise from the content being internally inconsistent (not all of the information can be true at the same time), or from the vocabulary being incapable of uniquely resolving the combination of term values. This test specifically does not consider the content of dwc:higherGeography. If dwc:country contains a value and dwc:stateProvince does not, this test will return NOT_COMPLIANT. Use cases where knowledge to the level of country is adequate for the fitness of the data should not include this test. @tucotuco: "Of #200 and #201, #201 is the strongest test. If it passes for a record, #200 must necessarily also pass and doesn't tell you anything. If #201 fails, #200 could still pass and that would tell you that there are multiple matches on the dwc:country/dwc:stateProvince combo: It would tell you the nature of the problem. Along with #42 (dwc:country not empty), #200 would tell you whether there was an ambiguous combination of country (not empty) and dwc:stateProvince, such as would happen with Argentina/Buenos Aires. While if country is empty, then the ambiguity is purely at the dwc:stateProvince level". |
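The Expected Response in the table can be read as a short decision procedure. The following is a minimal sketch of that procedure in Python, not the Kurator implementation and not a TGN client. The `Place` and `Response` classes, the `validate` function and the `TOY_AUTHORITY` records are all hypothetical names introduced for illustration, and the toy records are not taken from TGN. The sketch also assumes that an empty dwc:country is matched on dwc:stateProvince alone, which is one interpretation of the specification.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Place:
    """One toy authority record: a first-level division and its parent."""
    names: frozenset         # accepted names/abbreviations (illustrative only)
    parent: str              # name of the parent entity
    parent_is_country: bool  # True if the parent is ISO 3166 country-like


@dataclass(frozen=True)
class Response:
    status: str
    result: Optional[str] = None
    comment: str = ""


# Toy stand-in for bdq:sourceAuthority; these records are NOT taken from TGN.
TOY_AUTHORITY = [
    Place(frozenset({"Rio Negro"}), "Argentina", True),
    Place(frozenset({"Buenos Aires"}), "Argentina", True),
    Place(frozenset({"Western Australia", "WA"}), "Australia", True),
    Place(frozenset({"Washington", "WA"}), "United States", True),
]


def is_empty(value):
    """Treat None and whitespace-only strings as empty."""
    return value is None or value.strip() == ""


def validate(country, state_province, authority):
    """Sketch of VALIDATION_COUNTRYSTATEPROVINCE_UNAMBIGUOUS."""
    if authority is None:
        return Response("EXTERNAL_PREREQUISITES_NOT_MET",
                        comment="bdq:sourceAuthority is not available")
    if is_empty(country) and is_empty(state_province):
        return Response("INTERNAL_PREREQUISITES_NOT_MET",
                        comment="dwc:country and dwc:stateProvince are empty")
    if is_empty(state_province):
        # Per the Notes: a country without a stateProvince is NOT_COMPLIANT.
        return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                        "dwc:stateProvince is empty")
    # Keep child-parent pairs whose parent is country-like; an empty
    # dwc:country places no constraint on the parent (assumption).
    matches = [p for p in authority
               if state_province in p.names
               and p.parent_is_country
               and (is_empty(country) or p.parent == country)]
    if len(matches) == 1:
        return Response("RUN_HAS_RESULT", "COMPLIANT",
                        "dwc:country and dwc:stateProvince are unambiguous")
    parents = ", ".join(sorted(p.parent for p in matches)) or "none"
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT",
                    "not resolved to a single result; parents matched: " + parents)
```

For instance, `validate("Argentina", "Rio Negro", TOY_AUTHORITY)` resolves to one child-parent pair, while `validate("", "WA", TOY_AUTHORITY)` matches two parents and is therefore not compliant. Swapped values such as `validate("Rio Negro", "Argentina", TOY_AUTHORITY)` find no pair at all, because the parent must be the country-like entity.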
Suggest modifying the Expected Response (changes in italics)
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if either of the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are unambiguously resolved in the bdq:sourceAuthority; otherwise NOT_COMPLIANT
I don't think that is right. As per @tucotuco examples with #95, we are testing for ambiguity and one of the terms can be empty.
I agree, it is correct as "INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY".
How about:
EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if the terms dwc:country and dwc:stateProvince are EMPTY; COMPLIANT if the combination of values of dwc:country and dwc:stateProvince are unambiguously resolved to a single result with a child-parent relationship in the bdq:sourceAuthority and the entity matching the value of dwc:country in the bdq:sourceAuthority is an ISO country-like entity in the bdq:sourceAuthority; otherwise NOT_COMPLIANT
This phrasing avoids a compliant result from mismapping of dwc:county onto stateProvince and stateProvince onto country, or instances where dwc:country and dwc:stateProvince are switched.
Done
Added to Notes: "This test will fail if there are leading or trailing white space or non-printing characters."
In the Notes, the reference to "See table #95 (comment)" (i.e. "See table https://github.com/tdwg/bdq/issues/95#issuecomment-1226450014") will need to be updated, but I am not sure how we can reference the comment.
Updated Parameter(s) value to align with other tests
Post Zoom 11/7/2023, I have aligned the Source Authority with the suggested syntax:
bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" [https://www.getty.edu/research/tools/vocabularies/tgn/index.html]
to
bdq:sourceAuthority default = "The Getty Thesaurus of Geographic Names (TGN)" {[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}
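For anyone consuming these tables programmatically, the aligned Source Authority syntax can be split into its parts with a regular expression. This is only a sketch under the assumption that the value always has the shape `parameter default = "name" {[url]}`. The pattern and the `parse_source_authority` function are hypothetical and are not part of any BDQ tooling.

```python
import re

# Matches: <parameter> default = "<name>" {[<url>]}
SOURCE_AUTHORITY_RE = re.compile(
    r'^(?P<parameter>\S+)\s+default\s*=\s*'
    r'"(?P<name>[^"]+)"\s*'
    r'\{\[(?P<url>[^\]]+)\]\}$'
)


def parse_source_authority(text):
    """Return a dict with parameter, name and url, or None if the
    text does not follow the braces-and-brackets syntax."""
    match = SOURCE_AUTHORITY_RE.match(text.strip())
    return match.groupdict() if match else None


NEW_SYNTAX = ('bdq:sourceAuthority default = '
              '"The Getty Thesaurus of Geographic Names (TGN)" '
              '{[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]}')
OLD_SYNTAX = ('bdq:sourceAuthority default = '
              '"The Getty Thesaurus of Geographic Names (TGN)" '
              '[https://www.getty.edu/research/tools/vocabularies/tgn/index.html]')

parsed = parse_source_authority(NEW_SYNTAX)
```

With this pattern the older form without braces is rejected, so `parse_source_authority(OLD_SYNTAX)` returns `None`.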
Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".
Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"
Removed inapplicable "fail" text from note. This is covered by "unambiguous" in the specification, and leading/trailing whitespace should not block matches.
Updated Notes from @tucotuco's Comment https://github.com/tdwg/bdq/issues/21#issuecomment-2282949284 which I thought was needed here.
Altered Expected Response to add "administrative" entity
Added 3166 qualifier to the ISO ref in the Expected Response and added two ISO 3166 references