tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-MEASURE_VALIDATIONTESTS_PREREQUISITESNOTMET #134

Open pzermoglio opened 6 years ago

pzermoglio commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 49a94636-a562-4e6b-803c-665c80628a3d |
| Label | MEASURE_VALIDATIONTESTS_PREREQUISITESNOTMET |
| Description | The number of distinct VALIDATION tests that have a Response.status="EXTERNAL_PREREQUISITES_NOT_MET" or "INTERNAL_PREREQUISITES_NOT_MET" for a given record. |
| TestType | Measure |
| Darwin Core Class | bdq:Validation |
| Information Elements ActedUpon | |
| Information Elements Consulted | bdq:AllValidationTestsRunOnSingleRecord |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if no tests of type VALIDATION were run; Report the number of tests of output type VALIDATION that did not run because prerequisites for those tests were not met (Result.status="INTERNAL_PREREQUISITES_NOT_MET" or "EXTERNAL_PREREQUISITES_NOT_MET") |
| Data Quality Dimension | Completeness |
| Term-Actions | VALIDATIONTESTS_PREREQUISITESNOTMET |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2024-08-18 |
| Examples | [Response.status=RUN_HAS_RESULT, Response.result="27", Response.comment="27 VALIDATION tests had either INTERNAL_PREREQUISITES_NOT_MET or EXTERNAL_PREREQUISITES_NOT_MET"] |
| Source | TG2-Gainesville |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | We have three individual measures for pass (MEASURE_VALIDATIONTESTS_COMPLIANT (45fb49eb-4a1b-4b49-876f-15d5034dfc73)), fail (MEASURE_VALIDATIONTESTS_NOTCOMPLIANT (453844ae-9df4-439f-8e24-c52498eca84a)), and PREREQUISITES_NOT_MET (49a94636-a562-4e6b-803c-665c80628a3d). To get the total number of tests that were attempted, add all three measures. To get the total number of tests that ran, add NOT_COMPLIANT (fail) and COMPLIANT (pass). |

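As a rough illustration of the Expected Response above, here is a minimal Python sketch of the measure for a single record. It is not taken from any reference implementation: the `Response` class, its field names, and the function name are assumptions made for this example, and "distinct" is interpreted as "one response per test label".

```python
from dataclasses import dataclass
from typing import Iterable, Optional

PREREQUISITES_NOT_MET = {
    "INTERNAL_PREREQUISITES_NOT_MET",
    "EXTERNAL_PREREQUISITES_NOT_MET",
}


@dataclass(frozen=True)
class Response:
    """Hypothetical container for one test response on one record."""
    label: str                    # e.g. "VALIDATION_BASISOFRECORD_NOTEMPTY"
    test_type: str                # "VALIDATION", "AMENDMENT", "MEASURE", ...
    status: str                   # Response.status
    result: Optional[str] = None  # Response.result
    comment: str = ""             # Response.comment


def measure_validationtests_prerequisitesnotmet(
    responses: Iterable[Response],
) -> Response:
    label = "MEASURE_VALIDATIONTESTS_PREREQUISITESNOTMET"
    # Keep one response per distinct VALIDATION label (assumed meaning of "distinct").
    validations = {r.label: r for r in responses if r.test_type == "VALIDATION"}
    if not validations:
        # No VALIDATION tests were run on this record.
        return Response(label, "MEASURE", "INTERNAL_PREREQUISITES_NOT_MET",
                        comment="No VALIDATION tests were run")
    count = sum(r.status in PREREQUISITES_NOT_MET for r in validations.values())
    return Response(
        label, "MEASURE", "RUN_HAS_RESULT", str(count),
        f"{count} VALIDATION tests had either INTERNAL_PREREQUISITES_NOT_MET "
        "or EXTERNAL_PREREQUISITES_NOT_MET",
    )
```

Responses of other test types (AMENDMENT, ISSUE, and so on) are ignored, so the count only reflects VALIDATION tests.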
chicoreus commented 6 years ago

This probably doesn't fit into the framework - prerequisites not met is a result.status not a result.value, not clear if a measure can evaluate that metadata.

ArthurChapman commented 6 years ago

Might not fit the Framework, but important for the tests and fits with the other related Measures - No. of Validations tests passed, No. failed and the No. that couldn't be run because the prerequisites were not met. These are metadata on the tests, and without this one - the others make no sense.

chicoreus commented 6 years ago

@ArthurChapman Sounds like we need to file an issue against the framework for @allankv to evaluate what might be needed there to support this need. For a single record (and we should probably rename this test to include SINGLE in the name, since it is probably important to clearly distinguish measures operating on single records from those operating on multiple records), the sum of Problems pass, Problems fail, and Problems prerequisites not met should equal the total number of Problems tested and be consistent among records in a single data quality report (likewise for Validations COMPLIANT, Validations NOT_COMPLIANT, and Validations with prerequisites not met).

ArthurChapman commented 6 years ago

I don't think that this is a SINGLE record test. This (and the other tests mentioned above) is meant to be a count in a dataset when you run all the tests on a dataset and this is a report on the tests run on that dataset at a point in time. I think we originally called it a REPORT rather than a MEASURE.

Tasilee commented 6 years ago

I don't agree with @ArthurChapman about this test being multi-record: It is definitely single record. Like any of the assertions, they can be accumulated across any set of multiple records (or datasets etc). Like the other MEASURES, they are additive. In the case of VALIDATIONs, the result will be a count of COMPLIANT/NOT_COMPLIANT. With AMENDMENTS, I presume RUN/FAILED/...?
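To illustrate the "additive" point, per-record measures can simply be summed over any set of records. A minimal sketch, assuming each record's measures are held in a plain dictionary (the key names are made up for this example):

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def accumulate(per_record: Iterable[Mapping[str, int]]) -> Dict[str, int]:
    # Sum each single-record measure across the chosen set of records.
    total: Counter = Counter()
    for measures in per_record:
        total.update(measures)  # Counter.update adds counts key by key
    return dict(total)
```

The same function works for a dataset, a subset of records, or several datasets combined, because the single-record counts carry no multi-record state.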

ArthurChapman commented 6 years ago

I agree with you @Tasilee - must have been late at night when I was responding to that - of course it is Single Record.

Tasilee commented 2 years ago

Slight tweak of Expected Response applied:

INTERNAL_PREREQUISITES_NOT_MET if no tests of type VALIDATION were run; REPORT the number of tests of output type VALIDATION that did not run because prerequisites for those tests were not met (Result.status="INTERNAL_PREREQUISITES_NOT_MET" or "EXTERNAL_PREREQUISITES_NOT_MET"); otherwise NOT_REPORTED

tucotuco commented 2 years ago

I suggest the Description:

'The number of distinct VALIDATION tests that have a Response.status="EXTERNAL_PREREQUISITES_NOT_MET" or "INTERNAL_PREREQUISITES_NOT_MET" for a given record.'

in place of:

'The number of VALIDATION type tests run on a record that have a Response.status="EXTERNAL_PREREQUISITES_NOT_MET" or "INTERNAL_PREREQUISITES_NOT_MET".'

Tasilee commented 2 years ago

From Zoom meeting 30th May 2022, change the Expected Response

INTERNAL_PREREQUISITES_NOT_MET if no tests of type VALIDATION were run; REPORT the number of tests of output type VALIDATION that did not run because prerequisites for those tests were not met (Result.status="INTERNAL_PREREQUISITES_NOT_MET" or "EXTERNAL_PREREQUISITES_NOT_MET"); otherwise NOT_REPORTED

to

INTERNAL_PREREQUISITES_NOT_MET if no tests of type VALIDATION were run; Report the number of tests of output type VALIDATION that did not run because prerequisites for those tests were not met (Result.status="INTERNAL_PREREQUISITES_NOT_MET" or "EXTERNAL_PREREQUISITES_NOT_MET")

ArthurChapman commented 1 year ago

Updated wording of Notes to be consistent with #135 and to remove internal GitHub References.

Tasilee commented 1 year ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted". This MEASURE I am unsure about: I opted for "Consulted"

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"

chicoreus commented 2 months ago

AllDarwinCoreTerms needs to be replaced by a list of relevant validations, in the form (used in the multirecord measures):

bdq:VALIDATION_BASISOFRECORD_NOTEMPTY.Response, as it is the results of validations on the single records that are the information elements for this test, not the darwin core terms.

We should probably also split this test into one test for each use case, with information elements matching the validations found in that use case.

Tasilee commented 2 months ago

I agree @chicoreus, but there are two issues. First up, we probably should not list CORE VALIDATION tests as these may change depending on context. Second, we don't agree about splitting on use case.

I have changed Information Element Consulted from "All DarwinCoreTerms" to "All CORE tests of type VALIDATION that were run"

chicoreus commented 2 months ago

Good, we just need to use something machine interpretable as the specific information element. That is the advantage of listing the possible validations as information elements consulted.

We should be able to agree to a machine readable term that means all tests of a given type that were run on the record.

bdq:AllValidationsRunOnSingleRecord?

Something like that would then be flexible to use case, and would support users who include additional validations in their test suite, and would be consistent with our use of specific Darwin Core terms as information elements.

The framework allows for very generic information elements. One thing we've been doing to aid implementors is to be specific about which input terms should be bound as information elements in each test, rather than using Space/Time/etc. as information elements.
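One way an implementation might treat such a generic term is to expand it at run time into the specific validation responses available for the record. The sketch below is hypothetical: the function, its arguments, and the test labels other than VALIDATION_BASISOFRECORD_NOTEMPTY are illustrative, and the term is spelled as in the specification table at the top of this issue.

```python
from typing import List, Mapping

ALL_VALIDATIONS = "bdq:AllValidationTestsRunOnSingleRecord"


def expand_information_element(term: str, tests_run: Mapping[str, str]) -> List[str]:
    """Expand the generic term into 'bdq:<LABEL>.Response' elements.

    tests_run maps each test label run on the record to its test type.
    Any other term is returned unchanged.
    """
    if term != ALL_VALIDATIONS:
        return [term]
    return sorted(
        f"bdq:{label}.Response"
        for label, test_type in tests_run.items()
        if test_type == "VALIDATION"
    )


# Illustrative test suite for one record; labels other than the first are placeholders.
suite = {
    "VALIDATION_BASISOFRECORD_NOTEMPTY": "VALIDATION",
    "VALIDATION_EXAMPLE_EXTRA": "VALIDATION",
    "AMENDMENT_EXAMPLE": "AMENDMENT",
}
elements = expand_information_element(ALL_VALIDATIONS, suite)
```

Because the expansion is driven by whatever tests were actually run, a user who adds validations to their suite gets them counted without any change to the measure's specification.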

Tasilee commented 2 months ago

Changed Information Elements Consulted to "bdq:AllValidationTestsRunOnSingleRecord" on all relevant MEASURE type tests