tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_SUBGENUS_NOTEMPTY #265

Status: Closed (Tasilee closed 3 months ago)

Tasilee commented 9 months ago
| TestField | Value |
| --- | --- |
| GUID | b57d043b-cd94-4aba-b375-d7ee8988915c |
| Label | VALIDATION_SUBGENUS_NOTEMPTY |
| Description | Is there a value in dwc:subgenus? |
| TestType | Validation |
| Darwin Core Class | dwc:Taxon |
| Information Elements ActedUpon | dwc:subgenus |
| Information Elements Consulted | |
| Expected Response | COMPLIANT if dwc:subgenus is bdq:NotEmpty; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Completeness |
| Term-Actions | SUBGENUS_NOTEMPTY |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2024-02-07 |
| Examples | [dwc:subgenus="strobus": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:subgenus is bdq:NotEmpty"] [dwc:subgenus="": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:subgenus is bdq:Empty"] |
| Source | TG2 |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | Limited uses; most taxa are not placed into subgenera. This bdq:Supplementary test is not regarded as CORE (cf. bdq:CORE) for one or more of the following reasons: not being widely applicable; not informative; not straightforward to implement; or likely to return a high percentage of either bdq:COMPLIANT or bdq:NOT_COMPLIANT results (cf. bdq:Response.result). A Supplementary test may be implemented as CORE when a suitable use case exists. |
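
As a rough illustration of the Expected Response above, the test could be sketched in Python as follows. This is not one of the project's implementations: the `Response` class and the function names are hypothetical, and the sketch assumes that bdq:Empty covers both a missing value and a whitespace-only string.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Response:
    """Hypothetical container mirroring the Response fields used in the Examples."""
    status: str
    result: str
    comment: str


def is_empty(value: Optional[str]) -> bool:
    # Assumption: bdq:Empty means the value is missing or contains only whitespace.
    return value is None or value.strip() == ""


def validation_subgenus_notempty(subgenus: Optional[str]) -> Response:
    """COMPLIANT if dwc:subgenus is bdq:NotEmpty; otherwise NOT_COMPLIANT."""
    if is_empty(subgenus):
        return Response("RUN_HAS_RESULT", "NOT_COMPLIANT", "dwc:subgenus is bdq:Empty")
    return Response("RUN_HAS_RESULT", "COMPLIANT", "dwc:subgenus is bdq:NotEmpty")
```

Because only dwc:subgenus is acted upon and nothing is consulted, the function takes a single argument and always returns RUN_HAS_RESULT.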
ArthurChapman commented 9 months ago

Corrected dwc:subGenus to dwc:subgenus throughout

chicoreus commented 9 months ago

Another test that will return NOT_COMPLIANT for a large number of perfectly correct and usable records. Uses of this test are very limited. It might be saved by including dwc:scientificName as an information element consulted: if dwc:scientificName contains a subgenus, or if an external authority is consulted and places the dwc:scientificName in a subgenus, then this test could assert that dwc:subgenus should contain a value. Without changes to consider whether there should be a value in dwc:subgenus, this test should be marked do not implement.
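
A minimal sketch of the first half of that suggestion, consulting dwc:scientificName, might look like the following. It is a heuristic only: it assumes the zoological convention of writing the subgenus in parentheses between the genus and the specific epithet, it ignores botanical forms such as "subg.", and the function names are hypothetical. Consulting an external authority is not covered.

```python
import re
from typing import Optional

# Assumption: a subgenus appears as "Genus (Subgenus) epithet". Parenthesised
# authorship such as "(Linnaeus, 1758)" follows the epithet, so it is not matched.
_SUBGENUS_IN_NAME = re.compile(r"^\s*[A-Z][a-z-]+\s+\(([A-Z][a-z-]+)\)")


def subgenus_from_scientific_name(scientific_name: Optional[str]) -> Optional[str]:
    """Return the parenthesised subgenus in a scientific name, or None if absent."""
    if not scientific_name:
        return None
    match = _SUBGENUS_IN_NAME.match(scientific_name)
    return match.group(1) if match else None


def subgenus_missing_but_implied(scientific_name: Optional[str],
                                 subgenus: Optional[str]) -> bool:
    """True when the name contains a subgenus but dwc:subgenus has no value."""
    implied = subgenus_from_scientific_name(scientific_name)
    subgenus_is_empty = subgenus is None or subgenus.strip() == ""
    return implied is not None and subgenus_is_empty
```

With a check like this, an empty dwc:subgenus would only be flagged when the record itself indicates that a subgenus exists, rather than for every record lacking one.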

ArthurChapman commented 9 months ago

@chicoreus - I understand your argument, but surely this falls right into the Supplementary definition. We don't consult other elements in NOTEMPTY tests and what you are suggesting would be another test type. You are right that this test holds little value as is, but like DAY_EMPTY it may have value when there is a suite of tests on SUBGENUS. I think we include these as simple NOTEMPTY Supplementary tests, perhaps with additional wording in the notes, but at least there are words attached in your comments.

chicoreus commented 9 months ago

@ArthurChapman yes, it falls into supplementary, but we need to provide supplementary test definitions that would have utility. Several of the latest set of supplementary tests make very little sense when they test a single term for a value in isolation from other terms. A number of these new supplementary tests would make a great deal of sense if they included additional information elements consulted, to assess whether the term under test (the acted-upon term) should have a value or not. This is one of those tests: it has power for assessing data quality in a sparsely populated term when other terms contain values indicating that this term should contain a value. I am not arguing that this goes into core, but that we need to define a number of the supplementary tests more carefully so that they will have utility.

Tasilee commented 9 months ago

Would you be happy @chicoreus if we added something like this (or similar) to the Notes to cover potential future implementations (valid use cases)?

This Supplementary test would have a small value in evaluating 'fitness for use' in isolation from other related taxonomic terms.

chicoreus commented 9 months ago

This needs further refinement. It isn't yet in a suitable form to be considered supplementary.

It may fall into the category of immature.

In its current form it is simply not a useful test. It needs to consider other information to determine whether a subgenus should be present. This could include examining the taxonRank (if Genus or higher, then dwc:subgenus should be empty), consulting a source authority to see if the scientificNameID or scientificName is placed within a subgenus in the authority's classification, or considering the higherTaxonomy to see if a subgenus is present therein.
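
The taxonRank part of this could be sketched roughly as below. The rank list is illustrative and incomplete, the function name is hypothetical, and "UNDETERMINED" is a placeholder label for this sketch, not a bdq vocabulary term. The source-authority and higher-taxonomy checks are left out.

```python
from typing import Optional

# Assumption: an illustrative, incomplete set of ranks at genus level or above.
GENUS_OR_HIGHER = {"kingdom", "phylum", "class", "order", "family", "genus"}


def check_subgenus_against_rank(taxon_rank: Optional[str],
                                subgenus: Optional[str]) -> str:
    """Apply only the rule 'if Genus or higher, dwc:subgenus should be empty'."""
    has_subgenus = subgenus is not None and subgenus.strip() != ""
    rank = (taxon_rank or "").strip().lower()
    if rank in GENUS_OR_HIGHER:
        # A genus-or-higher record should not carry a subgenus value.
        return "NOT_COMPLIANT" if has_subgenus else "COMPLIANT"
    # For other or missing ranks, the rank alone cannot say whether a subgenus
    # is expected; an authority lookup would be needed to decide.
    return "UNDETERMINED"
```

For example, a record with taxonRank "Genus" and an empty dwc:subgenus would pass, while ranks below genus fall through to the placeholder result.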