tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_SUPERFAMILY_FOUND #206

Closed by chicoreus 9 months ago

chicoreus commented 1 year ago
| TestField | Value |
| --- | --- |
| GUID | 2a45e0e9-446c-429f-992d-c3ec1d29eebb |
| Label | VALIDATION_SUPERFAMILY_FOUND |
| Description | Does the value of dwc:superfamily occur at rank of Superfamily in bdq:sourceAuthority? |
| TestType | Validation |
| Darwin Core Class | Taxon |
| Information Elements ActedUpon | dwc:superfamily |
| Information Elements Consulted | |
| Expected Response | EXTERNAL_PREREQUISITES_NOT_MET if the bdq:sourceAuthority is not available; INTERNAL_PREREQUISITES_NOT_MET if dwc:superfamily is bdq:Empty; COMPLIANT if the value of dwc:superfamily is found as a value at the rank of superfamily in the bdq:sourceAuthority; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | SUPERFAMILY_FOUND |
| Parameter(s) | bdq:sourceAuthority |
| Source Authority | bdq:sourceAuthority default = "GBIF Backbone Taxonomy" [https://doi.org/10.15468/39omei] API endpoint [https://api.gbif.org/v1/species?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name=] |
| Specification Last Updated | 2023-09-22 |
| Examples | [dwc:superfamily="Muricoidea": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:superfamily has an equivalent at the rank of Superfamily in the bdq:sourceAuthority"], [dwc:superfamily="Metazoa": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:superfamily does not strictly have an equivalent at the rank of Superfamily in the bdq:sourceAuthority"] |
| Source | |
| References | |
| Example Implementations (Mechanisms) | Kurator/FilteredPush sci_name_qc Library |
| Link to Specification Source Code | https://github.com/FilteredPush/sci_name_qc/blob/v1.1.2/src/main/java/org/filteredpush/qc/sciname/DwCSciNameDQ.java#L3776 |
| Notes | The purpose of this test is to check whether the value is a name that is a result of a nomenclatural act at this rank. This excludes unpublished names, misspellings and vernacular names. The same test might return distinct results when using distinct source authorities. This bdq:Supplementary test is not regarded as CORE (cf. bdq:CORE) because of one or more of the reasons: not being widely applicable; not informative; not straightforward to implement or likely to return a high percentage of either bdq:COMPLIANT or bdq:NOT_COMPLIANT results (cf bdq:Response.result). A Supplementary test may be implemented as CORE when a suitable use case exists. |

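
The Expected Response in the specification can be read as a small decision procedure. The following is a minimal Python sketch of that reading, not the sci_name_qc implementation. The function name, the response dictionary layout and the `lookup` callable are hypothetical; `lookup` is assumed to return True when the name is found at the rank of superfamily and to raise `ConnectionError` when the bdq:sourceAuthority cannot be reached. Treating None, the empty string and whitespace-only strings as bdq:Empty is also an assumption.

```python
from typing import Callable, Optional


def is_empty(value: Optional[str]) -> bool:
    # Assumption: bdq:Empty covers None, "" and whitespace-only strings.
    return value is None or value.strip() == ""


def validation_superfamily_found(
    superfamily: Optional[str],
    lookup: Callable[[str], bool],
) -> dict:
    """Sketch of the Expected Response logic.

    `lookup` is a hypothetical stand-in for a query against the
    bdq:sourceAuthority; it raises ConnectionError if the authority
    is not available.
    """
    # Emptiness is checked first so that no remote call is made for an
    # empty value; the specification lists the external check first.
    if is_empty(superfamily):
        return {
            "status": "INTERNAL_PREREQUISITES_NOT_MET",
            "result": None,
            "comment": "dwc:superfamily is empty",
        }
    try:
        found = lookup(superfamily)
    except ConnectionError:
        return {
            "status": "EXTERNAL_PREREQUISITES_NOT_MET",
            "result": None,
            "comment": "bdq:sourceAuthority is not available",
        }
    return {
        "status": "RUN_HAS_RESULT",
        "result": "COMPLIANT" if found else "NOT_COMPLIANT",
        "comment": (
            "dwc:superfamily has an equivalent at the rank of Superfamily"
            if found
            else "dwc:superfamily does not have an equivalent at the rank of Superfamily"
        ),
    }


# Usage with an in-memory stand-in for the source authority.
known_superfamilies = {"Muricoidea"}
response = validation_superfamily_found("Muricoidea", lambda n: n in known_superfamilies)
print(response["status"], response["result"])
```

Passing the lookup in as a callable keeps the decision logic independent of any particular bdq:sourceAuthority, which matters here because the same test might return distinct results when using distinct source authorities. The sketch differs from the listed order of the Expected Response only when the value is empty and the authority is also unavailable.
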
chicoreus commented 1 year ago

Needs review. Parallels other validations that evaluate single higher rank terms. Entailed by https://github.com/tdwg/dwc/issues/65

Tasilee commented 1 year ago

Are the tests for superfamily, tribe and subtribe CORE? We don't test for dwc:subfamily or dwc:subgenus.

ArthurChapman commented 1 year ago

@Tasilee

They are probably simple new tests to add, but like you I would query whether they are CORE: perhaps Tribe, but not the other two.

Only a limited set of databases will use these levels, and in many of those the values will not be in critical use but just pulled from some external resource. I am also not sure how comprehensive the bdq:sourceAuthority will be for terms at those levels, so there may be many fails caused by the inadequacy of the sourceAuthority. I am not sure that even a test for NOTEMPTY would be of much value.

chicoreus commented 1 year ago

On Mon, 03 Jul 2023 17:05:15 -0700, Lee Belbin wrote:

> Are the tests for superfamily, tribe and subtribe CORE? We don't test for dwc:subfamily or dwc:subgenus.

That's exactly why I put them up for discussion. Use tends to vary by discipline.

chicoreus commented 1 year ago

After discussion, marking this as NOT CORE. This is the position we have taken for similar tests for other sub/super ranks.

Also, it appears that a current implementation against the desired default bdq:sourceAuthority, GBIF's backbone taxonomy, would return NOT_COMPLIANT for any non-empty dwc:superfamily value, as GBIF does not appear to have superfamily data in the backbone taxonomy.
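
Whether a source authority holds a name at the requested rank can be checked by filtering the lookup results by rank. The sketch below is an illustration only: it builds a request URL from the API endpoint given in the specification and filters an already-parsed response. It assumes the response is a JSON object with a `results` list whose entries carry `canonicalName` and `rank` fields, with rank values written in upper case (for example `SUPERFAMILY`). The function names and the sample payload are hypothetical, and the payload is made up rather than taken from GBIF.

```python
from urllib.parse import quote

# Endpoint as given in the specification; the name is appended to it.
ENDPOINT = (
    "https://api.gbif.org/v1/species"
    "?datasetKey=d7dddbf4-2cf0-4f39-9b2a-bb099caae36c&name="
)


def lookup_url(name: str) -> str:
    # Percent-encode the name so spaces and other characters are safe in a URL.
    return ENDPOINT + quote(name)


def found_at_rank(payload: dict, name: str, rank: str = "SUPERFAMILY") -> bool:
    """Return True if any result matches both the name and the rank.

    Assumption: `payload` is the parsed JSON response, with a "results"
    list of objects that have "canonicalName" and "rank" keys.
    """
    for usage in payload.get("results", []):
        if usage.get("rank") == rank and usage.get("canonicalName") == name:
            return True
    return False


# Made-up payload: the name exists, but only at another rank.
sample = {"results": [{"canonicalName": "Metazoa", "rank": "KINGDOM"}]}
print(lookup_url("Metazoa"))
print(found_at_rank(sample, "Metazoa"))
```

If an authority has no entries at the requested rank at all, a filter of this kind returns False for every name, so every non-empty value would be reported as NOT_COMPLIANT regardless of its quality.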

ArthurChapman commented 1 year ago

Split Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Changed "Field" to "TestField" and "Output Type" to "TestType", deleted "Warning Type", and updated "Specification Last Updated".