Closed: Tasilee closed this issue 7 months ago
TestField | Value |
---|---|
GUID | 6e62e8e3-cffd-4afe-957c-5b200a6eaa4e |
Label | ISSUE_VERBATIMLONGITUDE_NOTEMPTY |
Description | Is there a value in dwc:verbatimLongitude? |
TestType | Validation |
Darwin Core Class | Location |
Information Elements ActedUpon | dwc:verbatimLongitude |
Information Elements Consulted | |
Expected Response | POTENTIAL_ISSUE if dwc:verbatimLongitude is bdq:NotEmpty; otherwise NOT_ISSUE |
Data Quality Dimension | Completeness |
Term-Actions | VERBATIMLONGITUDE_NOTEMPTY |
Parameter(s) | |
Source Authority | |
Specification Last Updated | 2024-04-02 |
Examples | [dwc:verbatimLongitude="147 16 09.0E": Response.status=RUN_HAS_RESULT, Response.result=POTENTIAL_ISSUE, Response.comment="dwc:verbatimLongitude is bdq:NotEmpty"], [dwc:verbatimLongitude="": Response.status=RUN_HAS_RESULT, Response.result=NOT_ISSUE, Response.comment="dwc:verbatimLongitude is bdq:Empty"] |
Source | TG2 |
References | |
Example Implementations (Mechanisms) | |
Link to Specification Source Code | |
Notes | This bdq:Supplementary test is not regarded as CORE (cf. bdq:CORE) for one or more of the following reasons: it is not widely applicable; it is not informative; or it is likely to return a high percentage of either bdq:ISSUE or bdq:NOT_ISSUE results (cf. bdq:Response.result). A Supplementary test may be implemented as CORE when a suitable use case exists. |
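For illustration only, below is a minimal Python sketch of the Expected Response and Examples above. The function name and the dictionary-shaped Response are hypothetical, not part of any official bdq implementation; bdq:Empty is assumed here to mean a missing value or a whitespace-only string.

```python
# Illustrative sketch only; names and the Response shape are hypothetical,
# not an official bdq or FilteredPush API.

def issue_verbatimlongitude_notempty(verbatim_longitude):
    """Evaluate ISSUE_VERBATIMLONGITUDE_NOTEMPTY for a single value.

    Assumes bdq:Empty means None or a whitespace-only string.
    """
    is_empty = verbatim_longitude is None or str(verbatim_longitude).strip() == ""
    if is_empty:
        return {
            "status": "RUN_HAS_RESULT",
            "result": "NOT_ISSUE",
            "comment": "dwc:verbatimLongitude is bdq:Empty",
        }
    return {
        "status": "RUN_HAS_RESULT",
        "result": "POTENTIAL_ISSUE",
        "comment": "dwc:verbatimLongitude is bdq:NotEmpty",
    }


# Reproduces the two Examples in the table above.
assert issue_verbatimlongitude_notempty("147 16 09.0E")["result"] == "POTENTIAL_ISSUE"
assert issue_verbatimlongitude_notempty("")["result"] == "NOT_ISSUE"
```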
Needs further consideration. Georeferences where the original data are textual, UTM coordinates, PLSS coordinates, or coordinate systems other than latitude/longitude are expected to have no value in dwc:verbatimLongitude. This probably makes this test DO NOT IMPLEMENT, as it has very low power for assessing data quality except in the limited cases where all data were originally recorded as latitude/longitude.
We probably want to collect the verbatim NOTEMPTY tests into a single UseCase, not simply put them out as Supplementary. Substantially more thought is needed on what the data quality needs are that would lead us to regard data as unfit for use when verbatim data are not present.
> This probably makes this test DO NOT IMPLEMENT as it has very low power for assessing data quality except in limited cases where all data were originally latitude/longitude.
Thank you Paul! I agree with you that it does not directly impact the data quality assessment, but I have a slightly different perspective on this. Many datasets that we receive have latitude and longitude in degrees, minutes, seconds, or in degrees and decimal minutes. Very often I or the data provider need to do the extra step of converting these into decimal degrees. We keep the originally recorded latitude and longitude in the verbatimLatitude and verbatimLongitude fields. I think having this NOTEMPTY test can still alert the user that there is a value in verbatimLongitude. If the user suspects that there is a conversion error in decimalLongitude, the verbatimLongitude NOTEMPTY test can prompt them to double-check the decimalLongitude value against verbatimLongitude (together with other fields like verbatimSRS, of course).
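To illustrate this quality-control use case, the sketch below parses a DMS-style dwc:verbatimLongitude (as in the example "147 16 09.0E") and compares it with dwc:decimalLongitude. The function names, the regular expression, and the tolerance are assumptions made for this sketch; a real check would need to handle many more verbatim formats and consult dwc:verbatimSRS.

```python
# Hypothetical cross-check sketch; not part of any bdq specification.
import re

def dms_to_decimal(verbatim):
    """Convert strings like '147 16 09.0E' to decimal degrees (assumed format)."""
    match = re.fullmatch(r"\s*(\d+)\s+(\d+)\s+([\d.]+)\s*([EW])\s*", verbatim)
    if not match:
        return None  # unparseable here; real data has many more formats
    degrees, minutes, seconds, hemisphere = match.groups()
    value = int(degrees) + int(minutes) / 60 + float(seconds) / 3600
    return -value if hemisphere == "W" else value

def conversion_mismatch(verbatim_longitude, decimal_longitude, tolerance=0.001):
    """Return True if the converted verbatim value disagrees with decimalLongitude."""
    converted = dms_to_decimal(verbatim_longitude)
    if converted is None:
        return False  # cannot compare, so no mismatch is flagged
    return abs(converted - float(decimal_longitude)) > tolerance

# '147 16 09.0E' is approximately 147.2692 decimal degrees.
print(conversion_mismatch("147 16 09.0E", 147.2692))  # False: values agree
print(conversion_mismatch("147 16 09.0E", 14.72692))  # True: likely conversion error
```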
Thanks @ymgan. Your use case likely exemplifies why this test was considered previously. As @chicoreus has pointed out though, it is the interpretation of the NOT_COMPLIANT response that could be ambiguous as it stands: Not having a value doesn't necessarily imply lack of 'quality'.
@chicoreus - I feel that there is too much variation among the VERBATIM tests to gain value from combining them. I'd prefer a consistent strategy for the coordinate-related tests. @ymgan's use case is one of many that we may need to consider, or we could simply say (as @chicoreus said above) that there appears to be little to gain from this and related tests (without a lot of work/complexity) so DO NOT IMPLEMENT. Also, as mentioned (#253?) - maybe a tweak can render it potentially useful.
This test is similar to #247, #248, #249, #251 and, as I commented on #247, all of these should be Supplementary (and Closed) for now as they are "not informative" as they stand, and I would not like us to be further distracted (q.v. @chicoreus's comment on similar tests that "substantially more thought is needed..."). There could well be use cases, and we have commented adequately to inform future use.
@Tasilee: "it is the interpretation of the NOT_COMPLIANT response that could be ambiguous as it stands: Not having a value doesn't necessarily imply lack of 'quality'." That is not correct. Under the framework, NOT_COMPLIANT explicitly means that the data lack quality.
I don't think we can punt on this set of verbatim tests. We either need to rephrase them as issues, where we can assert a potential issue if the term under test is EMPTY, or we need to group them into their own use case; the current status of these tests isn't adequate for future workers.
Given @ymgan's comment, I'd advocate for changing these to issues, which would parallel the quality control needs she is describing.
That works nicely.