tdwg / bdq

Biodiversity Data Quality (BDQ) Interest Group
https://github.com/tdwg/bdq

TG2-AMENDMENT_EVENTDATE_STANDARDIZED #61

Open iDigBioBot opened 6 years ago

iDigBioBot commented 6 years ago
| TestField | Value |
| --- | --- |
| GUID | 718dfc3c-cb52-4fca-b8e2-0e722f375da7 |
| Label | AMENDMENT_EVENTDATE_STANDARDIZED |
| Description | Proposes an amendment of the value of dwc:eventDate to a valid ISO date. |
| TestType | Amendment |
| Darwin Core Class | dwc:Event |
| Information Elements ActedUpon | dwc:eventDate |
| Information Elements Consulted | |
| Expected Response | INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is bdq:Empty; AMENDED if the value of dwc:eventDate is not a properly formatted ISO 8601 date but is unambiguous, and altered to be a valid ISO 8601 date; otherwise NOT_AMENDED |
| Data Quality Dimension | Conformance |
| Term-Actions | EVENTDATE_STANDARDIZED |
| Parameter(s) | |
| Source Authority | |
| Specification Last Updated | 2024-09-16 |
| Examples | [dwc:eventDate="2021-28-10": Response.status=AMENDED, Response.result=dwc:eventDate="2021-10-28", Response.comment="dwc:eventDate contains an interpretable value. Assuming year-day-month input format"] [dwc:eventDate="10-28": Response.status=NOT_AMENDED, Response.result=, Response.comment="dwc:eventDate contains an ambiguous value"] |
| Source | Paul Morris, Lee Belbin |
| References | |
| Example Implementations (Mechanisms) | FilteredPush/Kurator:event_date_qc 10.5281/zenodo.596795. |
| Link to Specification Source Code | event_date_qc DwCEventDQ.amendmentEventdateStandardized(). A minimal set of unit tests is in DwCEventDQTestDefinitions; unit tests for the underlying verbatim date extraction code are in DateUtilsTest and DateUtilsTest. |
| Notes | The intent of the amended range is to capture the original uncertainty where possible. As in the example, we amend "1999-11" instead of "1999-11-01/1999-11-31". An AMBIGUOUS response is possible. |

iDigBioBot commented 6 years ago

Comment by Lee Belbin (@Tasilee) migrated from spreadsheet: Response to PJM comments (#6a) on dateModified equivalent for eventDate - and scores copied from 6a for consistency (no gaps)

chicoreus commented 6 years ago

event_date_qc has a test with a guid 134c7b4f-1261-41ec-acb5-69cd4bc8556f referencing EVENTDATE_FORMAT_CORRECTION, need to confirm the correct GUID for this test.

chicoreus commented 6 years ago

I don't agree with the note. "1999-11" and "1999-11-01/1999-11-31" are both valid ISO formatted dates representing the interval of time of November 1999. They have identical meanings, and are not different representations of precision (at least as far as I've been able to tell from the publicly available information on the ISO standard).
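
The equivalence described above can be sketched in Python. This is only an illustration: the helper name `month_to_interval` is hypothetical and is not part of event_date_qc. It expands a year-month value into the explicit day interval covering the same period. Since November has 30 days, the explicit interval for "1999-11" ends on 1999-11-30.

```python
import calendar
from datetime import date


def month_to_interval(year_month: str) -> str:
    """Expand an ISO 8601 'YYYY-MM' value into an explicit 'start/end' day interval.

    Hypothetical helper for illustration only.
    """
    year_text, month_text = year_month.split("-")
    year, month = int(year_text), int(month_text)
    # monthrange returns (weekday of the first day, number of days in the month)
    last_day = calendar.monthrange(year, month)[1]
    start = date(year, month, 1)
    end = date(year, month, last_day)
    return f"{start.isoformat()}/{end.isoformat()}"


print(month_to_interval("1999-11"))  # 1999-11-01/1999-11-30
```

Both forms describe the same span of days, which is the point made in the comment: the shorter form is a different notation rather than a different precision.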

chicoreus commented 6 years ago

The specification needs to describe how to handle ambiguity. 02/03/1884 is an ambiguous date, and the response should include a result state of AMBIGUOUS when providing an ISO format for such values, or the specification should provide limits on what forms of date are to be conformed with the ISO standard.
For maximum utility (and consistency), I would recommend that this test support the same set of date recognition and translation to ISO format as #86 TG2-AMENDMENT_EVENTDATE_FROM_VERBATIM, except for taking the value of eventDate as input.
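
A minimal sketch of the ambiguity check for values such as 02/03/1884, written in Python. The function names and the UNAMBIGUOUS/UNINTERPRETABLE labels are assumptions made for illustration; only AMBIGUOUS is a state named in this thread, and none of this is taken from event_date_qc. The idea is to enumerate every reading of nn/nn/yyyy (day first or month first) and count how many distinct valid dates result.

```python
import re
from datetime import date


def candidate_dates(value):
    """Return the set of ISO dates that an 'nn/nn/yyyy' value could mean."""
    if not re.fullmatch(r"\d{1,2}/\d{1,2}/\d{4}", value):
        return set()
    first, second, year = (int(part) for part in value.split("/"))
    candidates = set()
    # Try both orderings: day/month/year and month/day/year.
    for day, month in ((first, second), (second, first)):
        try:
            candidates.add(date(year, month, day).isoformat())
        except ValueError:
            pass  # this ordering is not a real calendar date
    return candidates


def classify(value):
    """Label a value by how many distinct dates it could represent (illustrative labels)."""
    count = len(candidate_dates(value))
    if count == 1:
        return "UNAMBIGUOUS"
    if count > 1:
        return "AMBIGUOUS"
    return "UNINTERPRETABLE"


print(classify("02/03/1884"))  # AMBIGUOUS: 1884-03-02 or 1884-02-03
print(classify("28/10/2021"))  # UNAMBIGUOUS: only 2021-10-28 is a valid date
```

A value such as 05/05/1990 comes out as unambiguous under this sketch because both orderings yield the same date.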

ArthurChapman commented 6 years ago

As in the note I left at https://github.com/tdwg/bdq/issues/86, I suggest changing the Test Prerequisites to read "The field dwc:eventDate is not EMPTY and is unambiguously interpretable as an ISO 8601:2004(E) date".

Tasilee commented 6 years ago

I agree with @ArthurChapman and again note that we need to consider responses from tests (as noted in #86).

Tasilee commented 2 years ago

Original was "unambiguously conform..." while I think it should read

"AMENDED if the value of dwc:eventDate was changed to conform to an unambiguous valid ISO 8601-1:2019 date.."

chicoreus commented 2 years ago

I think the formulation "unambiguously conform" is correct. It tells the implementer that the act of making the data conform needs to unambiguously produce a single result, while "unambiguous valid" simply reads as redundant for "valid", and is saying that the result must unambiguously be a valid date, rather than saying that if there is the potential for more than one valid result, that path can't be taken.

ArthurChapman commented 2 years ago

An ISO value can't be ambiguous, can it? I think "unambiguously conform..." is OK.

Tasilee commented 2 years ago

OK, done

Tasilee commented 1 year ago

Due to recent discussions, I have changed

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it can be unambiguously conformed to bdq:sourceAuthority; otherwise NOT_AMENDED

to

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it can be unambiguously conformed to ISO 8601-1:2019; otherwise NOT_AMENDED

...and removed the reference to bdq:sourceAuthority

tucotuco commented 1 year ago

Again picky, but I would say:

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it can be unambiguously transformed to ISO 8601-1:2019; otherwise NOT_AMENDED

Tasilee commented 1 year ago

Edited accordingly

Tasilee commented 1 year ago

I've edited the Expected Response according to @tucotuco suggestion:

From

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it can be unambiguously transformed to ISO 8601-1:2019; otherwise NOT_AMENDED

to

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was unambiguously formatted as a valid ISO 8601-1 date; otherwise NOT_AMENDED

and updated the References

ArthurChapman commented 1 year ago

See my comment under #26. Would the following be better?

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was unambiguous by formatting as a valid ISO 8601-1 date; otherwise NOT_AMENDED

Tasilee commented 1 year ago

I have updated the Expected Response as suggested above and the ISO Reference.

ArthurChapman commented 1 year ago

I think the Expected Response should be similar to #26 i.e.

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was unambiguous and formatted as a valid ISO 8601-1 date; otherwise NOT_AMENDED

chicoreus commented 1 year ago

Like #26 the language has become unclear for implementors: "AMENDED the value of dwc:eventDate if it was unambiguous and formatted as a valid ISO 8601-1 date" is read as formatted meaning "was already in the form", not meaning that it was changed. We need to revert to language from a prior version that is explicit about AMENDED being when the data are changed.

Propose changing from:

"INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was unambiguous and formatted as a valid ISO 8601-1 date; otherwise NOT_AMENDED"

To:

"INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was not a properly formatted ISO 8601-1 date but was unambiguous and was altered to be a valid ISO 8601-1 date; otherwise NOT_AMENDED"

chicoreus commented 1 year ago

Both here and in #26 we want to amend if:

  1. the existing value was not a correctly formatted date
  2. the existing value was unambiguous (e.g. not a 2 digit year, not nn-nn-yyyy where both nn are 12 or less).
  3. the value could be transformed into an ISO date.
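
The three conditions can be sketched as a response flow in Python. This is a simplified illustration under stated assumptions, not the event_date_qc implementation: `Response` and `amend_event_date` are hypothetical names, only single-day values are handled, and the only repair attempted is the year-day-month transposition from the specification example.

```python
import re
from datetime import date
from typing import NamedTuple, Optional


class Response(NamedTuple):
    status: str
    result: Optional[str]


def is_valid_iso_day(value):
    """True if value is a real calendar date written as YYYY-MM-DD."""
    if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", value):
        return False
    try:
        date.fromisoformat(value)
        return True
    except ValueError:
        return False


def amend_event_date(event_date):
    # Prerequisite: the value must not be empty (whitespace counts as empty here).
    if event_date is None or not event_date.strip():
        return Response("INTERNAL_PREREQUISITES_NOT_MET", None)
    value = event_date.strip()
    # Condition 1: only values that are not already correctly formatted are amended.
    if is_valid_iso_day(value):
        return Response("NOT_AMENDED", None)
    # Conditions 2 and 3: YYYY-DD-MM can be repaired only when the swapped
    # reading is a real date. The unswapped reading was already rejected above,
    # so at most one interpretation remains.
    match = re.fullmatch(r"(\d{4})-(\d{2})-(\d{2})", value)
    if match:
        year, day, month = (int(group) for group in match.groups())
        try:
            return Response("AMENDED", date(year, month, day).isoformat())
        except ValueError:
            pass  # neither ordering is a valid date
    return Response("NOT_AMENDED", None)


print(amend_event_date("2021-28-10"))  # Response(status='AMENDED', result='2021-10-28')
print(amend_event_date("10-28"))       # Response(status='NOT_AMENDED', result=None)
```

A fuller implementation would also need to cover intervals, verbatim formats and two-digit years, which this sketch leaves out.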

Tasilee commented 1 year ago

Again, I think the Expected Response may be clearer if your

"INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was not a properly formatted ISO 8601-1 date but was unambiguous and was altered to be a valid ISO 8601-1 date; otherwise NOT_AMENDED"

is rendered as

"INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was not a properly formatted ISO 8601-1 date but was unambiguously interpreted as a valid ISO 8601-1 date; otherwise NOT_AMENDED"

Tasilee commented 1 year ago

In line with #26, altered Expected response from

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED the value of dwc:eventDate if it was unambiguous and formatted as a valid ISO 8601-1 date; otherwise NOT_AMENDED

to

INTERNAL_PREREQUISITES_NOT_MET if dwc:eventDate is EMPTY; AMENDED if the value of dwc:eventDate was not a properly formatted ISO 8601-1 date but was unambiguous, and was altered to be a valid ISO 8601-1 date; otherwise NOT_AMENDED

Tasilee commented 1 year ago

Splitting bdqffdq:Information Elements into "Information Elements ActedUpon" and "Information Elements Consulted".

Also changed "Field" to "TestField", "Output Type" to "TestType" and updated "Specification Last Updated"