tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_COUNTRY_NOTEMPTY #42

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 6ce2b2b4-6afe-4d13-82a0-390d31ade01c |
| Label | VALIDATION_COUNTRY_NOTEMPTY |
| Description | Is there a value in dwc:country? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:country |
| Information Elements Consulted | dwc:countryCode |
| Expected Response | COMPLIANT if dwc:country is bdq:NotEmpty or dwc:countryCode has a value of "XZ" and either dwc:country is bdq:Empty or has a value of "High seas"; otherwise NOT_COMPLIANT ? |
| Data Quality Dimension | Completeness |
| Term-Actions | COUNTRY_NOTEMPTY |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2024-09-27 |
| Examples | [dwc:country="Eswatini": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:country is bdq:NotEmpty"] [dwc:country="": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:country is bdq:Empty"] |
| Source | |
| References | |
| Example Implementations (Mechanisms) | FilteredPush: geo_ref_qc |
| Link to Specification Source Code | geo_ref_qc DwCGeoRefDQ.validationCountryNotEmpty |
| Notes | Country is expected to be either bdq:Empty or ideally have a value of "High seas" or an agreed equivalent if material comes from the high seas, or from those portions of Antarctica outside of any sovereign nation. |

iDigBioBot commented 6 years ago

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Added post scoring for consistency

tucotuco commented 6 years ago

Agreed at TDWG 2018 DQIG meeting that the original assessment of the core nature of the test was correct based on consistency with other tests and participation in dependencies for other tests.

Tasilee commented 1 year ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". Also changed "Fields" to "TestField" and "Output Type" to "TestType".

chicoreus commented 8 months ago

Added comment "Country is expected to be empty if material comes from the high seas, or from those portions of Antartica outside of any sovereign nation." from duplicate #223.

Tasilee commented 1 month ago

Corrected spelling of "Antarctica" in Notes.

Also: This test is equivalent to #98 in returning NOT_COMPLIANT for records outside national jurisdictions ('high seas') where dwc:country is rightly bdq:Empty. The point is that we may be reducing 'quality' with valid 'high seas' records. The marine community may not be amused.

We need to handle these two tests similarly.

ArthurChapman commented 1 month ago

As I commented under #98, this test, like all other NOTEMPTY tests, only checks whether there is a value in that field. It makes no assumption about why the field is empty. It is a simple YES/NO test.
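For illustration, a minimal Python sketch of the simple yes/no behaviour described here, following the original Expected Response (COMPLIANT if dwc:country is bdq:NotEmpty; otherwise NOT_COMPLIANT). The function names and the dictionary shape of the response are hypothetical, and treating bdq:Empty as "missing or whitespace only" is an assumption, not the project's implementation.

```python
from typing import Optional


def is_empty(value: Optional[str]) -> bool:
    """Assumed reading of bdq:Empty: the value is missing or only whitespace."""
    return value is None or value.strip() == ""


def validation_country_notempty(country: Optional[str]) -> dict:
    """Report only whether dwc:country holds a value, not why it might be empty."""
    if is_empty(country):
        return {
            "status": "RUN_HAS_RESULT",
            "result": "NOT_COMPLIANT",
            "comment": "dwc:country is bdq:Empty",
        }
    return {
        "status": "RUN_HAS_RESULT",
        "result": "COMPLIANT",
        "comment": "dwc:country is bdq:NotEmpty",
    }


print(validation_country_notempty("Eswatini")["result"])  # COMPLIANT
print(validation_country_notempty("")["result"])          # NOT_COMPLIANT
```

The two calls mirror the two Examples listed in the test description.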

tucotuco commented 1 month ago

> Corrected spelling of "Antarctica" in Notes.
>
> Also: This test is equivalent to #98 in returning NOT_COMPLIANT for records outside national jurisdictions ('high seas') where dwc:country is rightly bdq:Empty. The point is that we may be reducing 'quality' with valid 'high seas' records. The marine community may not be amused.

We are not reducing quality. We are providing an alert in contexts where such an alert is likely to be useful. The marine community would not use the test in contexts where it is not useful, so they should not have cause to not be amused.

Tasilee commented 1 month ago

@tucotuco: Fair comment, but by the criterion that our tests are 'widely applicable', I would say that this test, #62 and #98 are candidates for Supplementary, to be used in a more specific context?

ArthurChapman commented 1 month ago

I don't see how these NOTEMPTY tests are different from any other NOTEMPTY tests. I see them as valuable tests.

tucotuco commented 1 month ago

...and still widely applicable.

chicoreus commented 1 month ago

I detect several fundamental misunderstandings here. First: Tests can only be understood in the context of use cases. One of our use cases is Spatial-Temporal Patterns. Others are free to compose tests into whatever other use cases they wish, but we have asserted this as an important use case for biodiversity data. If tests that assert NOT_COMPLIANT for marine data where there is no country are included in this use case, then they will make all marine data outside of national boundaries unfit for use under that use case. Others may compose tests differently, but we are composing them this way, and without a general solution (which could be XZ in dwc:countryCode, with that included as an information element in other tests), we are asserting that NO marine data outside EEZs is fit for use for Spatial-Temporal Patterns.
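To make the first point concrete, here is a rough Python sketch of how composing tests into a use case can exclude high seas records. Everything in it is hypothetical: the `fit_for_use` rule (every validation must be COMPLIANT), the record layout, and a use case holding only this one test are assumptions for illustration, not how the framework actually evaluates fitness for use.

```python
from typing import Callable, Dict, List, Optional

Record = Dict[str, Optional[str]]
Validation = Callable[[Record], str]


def country_notempty(record: Record) -> str:
    """The simple NOTEMPTY check on dwc:country, ignoring dwc:countryCode."""
    value = record.get("dwc:country")
    # Assumed reading of bdq:Empty: missing, or only whitespace.
    if value is None or value.strip() == "":
        return "NOT_COMPLIANT"
    return "COMPLIANT"


def fit_for_use(record: Record, use_case: List[Validation]) -> bool:
    """Hypothetical composition rule: every validation must be COMPLIANT."""
    return all(test(record) == "COMPLIANT" for test in use_case)


# Hypothetical use case containing only the test under discussion.
spatial_temporal_patterns: List[Validation] = [country_notempty]

land_record: Record = {"dwc:country": "Eswatini"}
high_seas_record: Record = {"dwc:country": "", "dwc:countryCode": "XZ"}

print(fit_for_use(land_record, spatial_temporal_patterns))       # True
print(fit_for_use(high_seas_record, spatial_temporal_patterns))  # False
```

Under these assumptions the high seas record is rejected even though it carries XZ in dwc:countryCode, because the simple test never consults that term.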

Second: Knowledge that material comes from (and observations were made) within national boundaries, or from areas outside national jurisdiction is becoming more and more important for legal reasons. Blank for country of origin isn't cutting it any more. If outside national jurisdiction, this needs to be positively asserted as an aspect of quality for multiple use cases.

Tasilee commented 1 month ago

I agree @chicoreus. Given your second point above, should we be aspirational here (and #98) by adding something like INTERNAL_PREREQUISITES_NOT_MET if dwc:country equals "high seas" or contains "seas" or something that won't find a country match... and, "XZ" in the case of #98?

chicoreus commented 1 month ago

On Thu, 26 Sep 2024 14:58:09 -0700 Lee Belbin @.***> wrote:

> I agree @chicoreus. Given your second point above, should we be aspirational here (and #98) by adding something like INTERNAL_PREREQUISITES_NOT_MET if dwc:country equals "high seas" or contains "seas" or something that won't find a country match... and, "XZ" in the case of #98?

I think it is a question of what the aspirational target is.

There's a good case for:

dwc:countryCode=XZ

We could advocate for dwc:country="High Seas", but there's a weaker case for that.

There is probably a good case for:

dwc:countryCode="XZ", dwc:country=""

In which case we'd want validation country not empty to include dwc:countryCode as an information element consulted and assert (probably) compliant if dwc:countryCode contained XZ, even if dwc:country was empty. That feels to me like the most defensible position to take, though we could also argue for:

dwc:countryCode="XZ", dwc:country="High Seas".

In #21 we are asserting "Non-country information such as "high seas" will fail this test (High Seas should use dwc:countryCode = "XZ" and have dwc:country empty)" in the notes.

-Paul

Tasilee commented 1 month ago

Thanks @chicoreus - I think we are pretty much aligned. How about changing the Expected Response from

COMPLIANT if dwc:country is bdq:NotEmpty; otherwise NOT_COMPLIANT

to

COMPLIANT if dwc:country is bdq:NotEmpty or dwc:countryCode has a value of "XZ" and either dwc:country is bdq:Empty or has a value of "High seas"; otherwise NOT_COMPLIANT ?

I will also change the Notes for both issues to something like

Country is expected to be either bdq:Empty or ideally have a value of "High seas" or an agreed equivalent if material comes from the high seas, or from those portions of Antarctica outside of any sovereign nation.

I will also, if agreed on the ER, add dwc:countryCode to the Information Elements Consulted.
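As a sketch only, the proposed Expected Response could be read as: dwc:country is bdq:NotEmpty, OR (dwc:countryCode is "XZ" AND (dwc:country is bdq:Empty OR dwc:country is "High seas")). That grouping is an assumption about the intended precedence; it is the reading that keeps the "Eswatini" example COMPLIANT. The function names, the response dictionary, the whitespace handling, and the case-insensitive match on "High seas" are likewise hypothetical.

```python
from typing import Optional


def is_empty(value: Optional[str]) -> bool:
    """Assumed reading of bdq:Empty: the value is missing or only whitespace."""
    return value is None or value.strip() == ""


def validation_country_notempty(
    country: Optional[str], country_code: Optional[str]
) -> dict:
    """One reading of the proposed Expected Response, consulting dwc:countryCode."""
    country_empty = is_empty(country)
    # Case-insensitive comparison is an assumption; the thread uses both
    # "High seas" and "High Seas".
    high_seas = (not country_empty) and country.strip().lower() == "high seas"
    code_is_xz = (country_code or "").strip() == "XZ"

    if not country_empty:
        result, comment = "COMPLIANT", "dwc:country is bdq:NotEmpty"
    elif code_is_xz and (country_empty or high_seas):
        # dwc:country is empty, but XZ positively asserts "outside national jurisdiction".
        result, comment = "COMPLIANT", 'dwc:country is bdq:Empty and dwc:countryCode is "XZ"'
    else:
        result, comment = "NOT_COMPLIANT", "dwc:country is bdq:Empty"
    return {"status": "RUN_HAS_RESULT", "result": result, "comment": comment}


print(validation_country_notempty("Eswatini", None)["result"])  # COMPLIANT
print(validation_country_notempty("", "XZ")["result"])          # COMPLIANT
print(validation_country_notempty("", "")["result"])            # NOT_COMPLIANT
```

Note that under this reading the "High seas" clause never changes the outcome, since any non-empty dwc:country is already COMPLIANT through the first condition; the wording effectively reduces to "dwc:country is bdq:NotEmpty or dwc:countryCode is XZ".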