tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-VALIDATION_MINELEVATION_INRANGE #39

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 0bb8297d-8f8a-42d2-80c1-558f29efe798 |
| Label | VALIDATION_MINELEVATION_INRANGE |
| Description | Is the value of dwc:minimumElevationInMeters within the Parameter range? |
| TestType | Validation |
| Darwin Core Class | dcterms:Location |
| Information Elements ActedUpon | dwc:minimumElevationInMeters |
| Information Elements Consulted | |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:minimumElevationInMeters is bdq:Empty or the value is not a number; COMPLIANT if the value of dwc:minimumElevationInMeters is within the range of bdq:minimumValidElevationInMeters to bdq:maximumValidElevationInMeters inclusive; otherwise NOT_COMPLIANT |
| Data Quality Dimension | Conformance |
| Term-Actions | MINELEVATION_INRANGE |
| Parameter(s) | bdq:minimumValidElevationInMeters; bdq:maximumValidElevationInMeters |
| Source Authority | bdq:minimumValidElevationInMeters default = "-430"; bdq:maximumValidElevationInMeters default = "8850" |
| Specification Last Updated | 2023-09-17 |
| Examples | [dwc:minimumElevationInMeters="0": Response.status=RUN_HAS_RESULT, Response.result=COMPLIANT, Response.comment="dwc:minimumElevationInMeters is IN_RANGE"] [dwc:minimumElevationInMeters="-500": Response.status=RUN_HAS_RESULT, Response.result=NOT_COMPLIANT, Response.comment="dwc:minimumElevationInMeters is NOT_IN_RANGE (<-430)"] |
| Source | ALA, GBIF |
| References | |
| Example Implementations (Mechanisms) | |
| Link to Specification Source Code | |
| Notes | We have rounded up the Parameter values. We are aware of sub-ice elevations in Antarctica to -3,500m and possible sampling in the atmosphere above the elevation of the top of Mt Everest that would fail this test but we support the odd false positive. |
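
The Expected Response above could be read as the following minimal Python sketch. It is not the project's implementation: the function name `validate_min_elevation_in_range`, the `Response` tuple, and the comment strings are assumptions made for illustration. Python's `float()` parsing stands in for "is a number", and non-finite values such as `nan` are treated as not a number, which is also an assumption. The default bounds come from the Source Authority entry.

```python
import math
from typing import NamedTuple, Optional

# Defaults taken from the Source Authority entry of the specification.
DEFAULT_MIN_VALID_ELEVATION = -430.0
DEFAULT_MAX_VALID_ELEVATION = 8850.0


class Response(NamedTuple):
    """Hypothetical response container (status, result, comment)."""
    status: str
    result: Optional[str]
    comment: str


def validate_min_elevation_in_range(
    minimum_elevation: Optional[str],
    min_valid: float = DEFAULT_MIN_VALID_ELEVATION,
    max_valid: float = DEFAULT_MAX_VALID_ELEVATION,
) -> Response:
    term = "dwc:minimumElevationInMeters"

    # Empty or missing value: prerequisites are not met.
    if minimum_elevation is None or minimum_elevation.strip() == "":
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None, f"{term} is empty")

    # Assumption: float() parsing decides whether the value is a number.
    try:
        value = float(minimum_elevation.strip())
    except ValueError:
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None, f"{term} is not a number")

    # Assumption: nan and infinities are treated as "not a number".
    if not math.isfinite(value):
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None, f"{term} is not a number")

    # Both bounds are inclusive, as stated in the Expected Response.
    if min_valid <= value <= max_valid:
        return Response("RUN_HAS_RESULT", "COMPLIANT", f"{term} is IN_RANGE")
    return Response("RUN_HAS_RESULT", "NOT_COMPLIANT", f"{term} is NOT_IN_RANGE")
```

Calling the function with no bounds uses the defaults; a caller working in a single region could pass narrower `min_valid` and `max_valid` values instead.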
iDigBioBot commented 6 years ago

Comment by Paula Zermoglio (@pzermoglio) migrated from spreadsheet: Elevation CAN be less than 0, and I would not expect it to be more than 8,848m (Mt. Everest summit).

ArthurChapman commented 6 years ago

Comments from Gainesville: Need to split into two tests

ianengelbrecht commented 5 years ago

'if the value of the field dwc:maximumElevationInMeters is number between -423 and 8850 inclusive': should that be dwc:minimumElevationInMeters? Also, is there a reason for -423 as the lower bound? I understand, possibly incorrectly, that the lowest point is the Dead Sea shore at -413m.

ArthurChapman commented 5 years ago

Thanks @ianengelbrecht - yes, it should be minimum and not maximum (I have fixed it). Thanks for picking that up. I see that the reference we cite, https://en.wikipedia.org/wiki/List_of_elevation_extremes_by_country, has been updated and the Dead Sea has been changed from -423 to -428 (perhaps other references give different values). I have corrected that in the expected response above. Should we alter this to -430 (or -450) or something similar to allow for rounding? What do you think, @tucotuco?

ArthurChapman commented 5 years ago

@tucotuco I see that the reference we cite has Mt Everest at 8848 m. We seem to have rounded that to 8850, so perhaps we should round the minimum to -430, especially as the Wikipedia footnote (34) states that the Dead Sea level falls about 1 m per year and gives -428 as the 2014 level. I am also wondering whether Wikipedia is the best reference here, or whether we should cite a more permanent one.

Tasilee commented 5 years ago

Just back on deck. Thanks @ianengelbrecht and @ArthurChapman. I agree that we should round these limits out to -430 to +8850, but I wonder then whether this one should be 'Parameterized' so that installers/implementers could provide their own min/max?

ArthurChapman commented 5 years ago

Interesting suggestion, @Tasilee. Parameterizing the test would make sense, for example if someone was working in Australia and wanted to set the maximum and minimum for Australia. We would then default to -430 to +8850 if the parameter is not set. We would need to rewrite the Expected Response and add a note to that effect.

tucotuco commented 5 years ago

I like the idea of rounding a bit beyond the extremes. It will avoid discussions (hopefully) of what different sources have to say.

I really like the idea of a parametrized test. We could create a reference data set of extremes by country code as well.
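
A reference data set of extremes by country code could be consulted along these lines. This is only a sketch: the function name `elevation_range_for`, the dictionary layout, and the `"XX"` entry with its bounds are hypothetical placeholders, not real reference data. The global fallback uses the rounded limits discussed above.

```python
from typing import Dict, Optional, Tuple

# Rounded global limits discussed in this thread, used as the fallback.
GLOBAL_RANGE: Tuple[float, float] = (-430.0, 8850.0)


def elevation_range_for(
    country_code: Optional[str],
    extremes: Dict[str, Tuple[float, float]],
) -> Tuple[float, float]:
    """Return (min, max) valid elevation for a country code.

    Falls back to the global range when the code is missing or unknown.
    """
    if country_code is None or country_code.strip() == "":
        return GLOBAL_RANGE
    return extremes.get(country_code.strip().upper(), GLOBAL_RANGE)


# Hypothetical placeholder entry; a real data set would need sourced values.
example_extremes = {"XX": (-20.0, 2300.0)}
low, high = elevation_range_for("xx", example_extremes)
```

The returned pair could then be supplied as the test's two Parameters, so the test logic itself stays unchanged.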

chicoreus commented 5 years ago

Data Quality Dimension isn't correct. This is a test of Likelihood, not Conformance, as there is no standard to conform to, just the distribution of likelihood of occurrences against elevations within some region of interest.

We should check all tests that assert a data quality dimension of Conformance to make sure that they do actually involve the data conforming to some standard rather than being a test of likely values in the real world.

ArthurChapman commented 5 years ago

I notice that the Expected Response doesn't include the possibility of Parameterization as discussed in the comments above. It probably needs rewriting to allow for Parameterization.

Tasilee commented 5 years ago

I agree with @chicoreus and have amended the Dimension. I will check Conformance on the others next.

@ArthurChapman: How do we refer to Parameters? Maybe like this: INTERNAL_PREREQUISITES_NOT_MET if the field dwc:minimumElevationInMeters is not present, is EMPTY or is not a number; COMPLIANT if the value of the field dwc:minimumElevationInMeters conforms to Parameters; otherwise NOT_COMPLIANT ? This would reduce edits.

Tasilee commented 5 years ago

Do all agree that #24, #107 (and this one) should be Data Quality Dimension = Likelihood?

What about #108? I presume Conformance.

ArthurChapman commented 5 years ago

@Tasilee I think your wording makes sense and would lead to consistency throughout, especially if we then use a namespace for the Parameters. How does this fit with the coding, @chicoreus?

Not sure about #24 and #108 being Likelihood - it is Conformance, because the minimum elevation cannot ever be greater than the maximum elevation, and the same holds for depth. @chicoreus? I am not sure you are correct with respect to Conformance and Range. The definition of Conformance, from the FFU Framework: "A Data Quality Dimension (q.v.) - conforms to a format, syntax, type, range, standard, or to the own nature of the Information Element (q.v.)." NB "range" in the definition. Likeliness, from the FFU Framework: "A Data Quality Dimension (q.v.) - probability of data having the expected value; the likelihood of a data having true values rather than having false values".

Tasilee commented 5 years ago

As we are providing elevation limits, the result either conforms to this range or it does not. There is no likeliness about it.

ArthurChapman commented 2 years ago

Added "of bdq:minimumValidElevationInMeters to bdq:maximumValidElevationInMeters inclusive" in COMPLIANT for consistency with other related tests

Tasilee commented 1 year ago

Edited Parameter(s) and Source authority to align with proposed structure and format

Tasilee commented 1 year ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted"