tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

Generate MultiRecord Measures for QualityAssurance that return COMPLETE/NOT_COMPLETE for compliant and prerequisites not met. #297

Open chicoreus opened 2 months ago

chicoreus commented 2 months ago

For each Validation in the set listed below, generate a Measure that operates on a MultiRecord and returns Response.result=COMPLETE if, for that Validation, every record in the MultiRecord has either Response.result=COMPLIANT or Response.status=INTERNAL_PREREQUISITES_NOT_MET.

These tests cover cases where empty values do not prevent the data from being useful for the specified UseCase.

The template for these would look like the table below, one for each Validation, with the source Validation written as {Validation}. Generate the actual tests from the bdqffdq compliant template instead, so that they match the TG2_tests.csv columns. We don't need to add an issue for each, but can track rationale management for this set of tests here.

These Measures expect each Validation to complete with Response.status=RUN_HAS_RESULT and Response.result=COMPLIANT, but also accept Response.status=INTERNAL_PREREQUISITES_NOT_MET, which arises when some prerequisite information element is empty. Data in that state can still have quality: for example, dwc:family may be empty for an identification to a rank above Family. See #295 for QA measures where only Response.result=COMPLIANT is accepted for data to have quality.
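
The rule described above can be sketched in a few lines of Python. This is only an illustration, not the bdqffdq implementation: the `Response` class, the function names, and the choice to report the Measure's own status as RUN_HAS_RESULT are assumptions made for the example, and an empty MultiRecord is treated as COMPLETE simply because `all()` of nothing is true.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional


@dataclass(frozen=True)
class Response:
    """Hypothetical stand-in for a test Response (status plus optional result)."""
    status: str
    result: Optional[str] = None


def passes_qa(response: Response) -> bool:
    """True if a Validation Response is acceptable for this kind of QA Measure."""
    if response.status == "INTERNAL_PREREQUISITES_NOT_MET":
        # Some prerequisite information element was empty; still acceptable here.
        return True
    return response.status == "RUN_HAS_RESULT" and response.result == "COMPLIANT"


def multirecord_measure_qa(responses: Iterable[Response]) -> Response:
    """COMPLETE if every Validation Response in the MultiRecord passes, else NOT_COMPLETE."""
    complete = all(passes_qa(r) for r in responses)
    # Assumption: the Measure itself reports status RUN_HAS_RESULT.
    return Response("RUN_HAS_RESULT", "COMPLETE" if complete else "NOT_COMPLETE")


def filter_for_qa(records: Iterable[dict],
                  validate: Callable[[dict], Response]) -> List[dict]:
    """Keep only the records whose Validation Response passes, so the Measure becomes COMPLETE."""
    return [rec for rec in records if passes_qa(validate(rec))]
```

Any other status, such as EXTERNAL_PREREQUISITES_NOT_MET, makes the Measure NOT_COMPLETE in this sketch, since only the two cases named above are accepted.
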

| TestField | Value |
|---|---|
| GUID | Generate for each. |
| Label | MULTIRECORD_MEASURE_QA_{Validation.Term-Actions} |
| Description | Measure if all {Validation} in a record set are COMPLIANT or INTERNAL_PREREQUISITES_NOT_MET (indicating some empty value) |
| TestType | Measure |
| Darwin Core Class | {Validation.Darwin Core Class} |
| Information Elements ActedUpon | {Validation}.Response |
| Information Elements Consulted | |
| Expected Response | COMPLETE if every {Validation} in the MultiRecord has Response.result=COMPLIANT or Response.status=INTERNAL_PREREQUISITES_NOT_MET, otherwise NOT_COMPLETE. |
| Data Quality Dimension | {Validation.Dimension} |
| Term-Actions | {Validation.Term-Actions} |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | Generate |
| Examples | |
| Source | TG2 |
| References | Veiga AK, Saraiva AM, Chapman AD, Morris PJ, Gendreau C, Schigel D, & Robertson TJ (2017) A conceptual framework for quality assessment and management of biodiversity data. PLOS ONE 12 (6): e0178731. https://doi.org/10.1371/journal.pone.0178731 |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | For Quality Assurance, filter record set until this measure is COMPLETE. |

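
As a rough illustration of filling in the template for one Validation, the sketch below builds the variable fields from a dictionary describing the source Validation. The function name, the dictionary keys, and the example values for VALIDATION_FAMILY_FOUND are assumptions for the example only; it is not the bdqffdq generation code. A fresh random GUID is generated on each call, so it will not reproduce the GUIDs actually assigned to these tests.

```python
import uuid


def measure_from_validation(validation: dict) -> dict:
    """Fill the MultiRecord QA Measure template for one Validation (hypothetical helper)."""
    label = validation["Label"]
    term_actions = validation["Term-Actions"]
    return {
        # Fresh random GUID; a real run would keep the GUID stable once assigned.
        "GUID": str(uuid.uuid4()),
        "Label": f"MULTIRECORD_MEASURE_QA_{term_actions}",
        "Description": (
            f"Measure if all {label} in a record set are COMPLIANT or "
            "INTERNAL_PREREQUISITES_NOT_MET (indicating some empty value)"
        ),
        "TestType": "Measure",
        "Darwin Core Class": validation["Darwin Core Class"],
        "Information Elements ActedUpon": f"{label}.Response",
        "Expected Response": (
            f"COMPLETE if every {label} in the MultiRecord has "
            "Response.result=COMPLIANT or "
            "Response.status=INTERNAL_PREREQUISITES_NOT_MET, otherwise NOT_COMPLETE."
        ),
        "Data Quality Dimension": validation["Dimension"],
        "Term-Actions": term_actions,
        "Source": "TG2",
    }


# Illustrative input only; these field values are assumed for the example.
example = {
    "Label": "VALIDATION_FAMILY_FOUND",
    "Term-Actions": "FAMILY_FOUND",
    "Darwin Core Class": "dwc:Taxon",
    "Dimension": "Conformance",
}
measure = measure_from_validation(example)
print(measure["Label"])
```
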
chicoreus commented 2 months ago

Per @Tasilee a list of these tests:

| Label | GUID |
|---|---|
| MULTIRECORD_MEASURE_QA_CLASS_FOUND | 21541436-641d-45a9-938c-537484d94eb7 |
| MULTIRECORD_MEASURE_QA_COORDINATES_COUNTRYCODE_CONSISTENT | d105bb0f-ec58-47d3-92f7-7d031f13534f |
| MULTIRECORD_MEASURE_QA_COORDINATES_STATEPROVINCE_CONSISTENT | c87aac27-bee7-45cf-b75c-e5a2d40b28e5 |
| MULTIRECORD_MEASURE_QA_COORDINATES_TERRESTRIALMARINE | 7e0f4e97-acae-466a-a9bf-c31956a85b4f |
| MULTIRECORD_MEASURE_QA_COORDINATEUNCERTAINTY_INRANGE | d94b0130-7a13-4fa8-955c-eff5c47bd9de |
| MULTIRECORD_MEASURE_QA_COUNTRY_COUNTRYCODE_CONSISTENT | 73fd9f74-7a81-4938-a51f-935d0786092d |
| MULTIRECORD_MEASURE_QA_COUNTRYCODE_STANDARD | fedf27b2-e01d-459f-98fc-7f0f39e3d4be |
| MULTIRECORD_MEASURE_QA_DAY_INRANGE | 85dc5d02-9847-420f-a026-6a0e70962d2a |
| MULTIRECORD_MEASURE_QA_DAY_STANDARD | 371035f6-42b5-494f-86d9-de2f44a6cdc6 |
| MULTIRECORD_MEASURE_QA_DEGREEOFESTABLISHMENT_STANDARD | ba953672-6526-4375-97e8-b4e9d1a7d3a0 |
| MULTIRECORD_MEASURE_QA_ENDDAYOFYEAR_INRANGE | c04d428a-13d0-4766-9df7-4dfb2ef5d5d8 |
| MULTIRECORD_MEASURE_QA_ESTABLISHMENTMEANS_STANDARD | 8dfed701-01a9-415d-a9f8-539280b75975 |
| MULTIRECORD_MEASURE_QA_FAMILY_FOUND | a07d7147-2db8-48ce-81b8-e47595ad5f17 |
| MULTIRECORD_MEASURE_QA_GENUS_FOUND | c5c8db83-3af0-4215-806f-e2f90574b138 |
| MULTIRECORD_MEASURE_QA_MAXDEPTH_INRANGE | c73d49d1-06e4-4272-8249-6a26e7f8abca |
| MULTIRECORD_MEASURE_QA_MAXELEVATION_INRANGE | 7c5a6ba0-399d-4570-878a-4a064e2406fe |
| MULTIRECORD_MEASURE_QA_MINDEPTH_INRANGE | 49d756a8-e654-4267-a290-d1def5d2c5f9 |
| MULTIRECORD_MEASURE_QA_MINDEPTH_LESSTHAN_MAXDEPTH | fcabd2c9-392c-4841-a5d7-e2680c9587ab |
| MULTIRECORD_MEASURE_QA_MINELEVATION_INRANGE | 1ba18c8b-66a6-47d9-a709-404439332dba |
| MULTIRECORD_MEASURE_QA_MINELEVATION_LESSTHAN_MAXELEVATION | 44f00697-ca66-43cf-8f16-646b40c0f514 |
| MULTIRECORD_MEASURE_QA_MONTH_STANDARD | b3c2bb86-e239-4532-ae0c-b121ec1ee025 |
| MULTIRECORD_MEASURE_QA_NAMEPUBLISHEDINYEAR_NOTEMPTY | 16059801-6adb-4e65-82f4-61eaa70d8df0 |
| MULTIRECORD_MEASURE_QA_ORDER_FOUND | 773bb288-fef3-4968-a65a-a69c74d6ecb5 |
| MULTIRECORD_MEASURE_QA_PATHWAY_STANDARD | ef31ba02-cea7-4d76-990f-99ebbd201fb4 |
| MULTIRECORD_MEASURE_QA_PHYLUM_FOUND | 17364d16-37b7-4ccb-9614-bfb95ff1bca9 |
| MULTIRECORD_MEASURE_QA_POLYNOMIAL_CONSISTENT | ef05b45b-13b8-4877-9e9d-fa44b332d83c |
| MULTIRECORD_MEASURE_QA_SEX_STANDARD | 1b3bbac4-7c00-46d6-8179-1d57c92374ad |
| MULTIRECORD_MEASURE_QA_STARTDAYOFYEAR_INRANGE | 8c217eee-9cd0-48c3-aea0-90151c6c5bfd |
| MULTIRECORD_MEASURE_QA_STATEPROVINCE_FOUND | 9c1cdf6a-0dbf-4828-a2e3-fb368f74d194 |

chicoreus commented 2 months ago

CSV file of tests:

https://github.com/tdwg/bdq/blob/master/tg2/core/TG2_multirecord_measure_tests.csv

Human readable markdown:

https://github.com/tdwg/bdq/blob/master/tg2/core/generation/docs/core_multirecord_measure_tests.md

ArthurChapman commented 2 months ago

Fixed several cases where PREREQUISITES was misspelled.