Open mjy opened 2 months ago
Not sure how to find this in TW. Can we post a list/JSON query? @mjy
@tmcelrath, I suggested in my report that I could tidy up the misformatted dates and give you a complete list of records with this problem, if you wanted. How I find these is explained on my "Darwin Core checker" website here
@Mesibov I haven't fully grokked this issue, but one quick comment (also impacts lat/long export).
We have 7 date fields: verbatim_date, and then the parsed start/end. Depending on whether or not the user has parsed verbatim_date, you may or may not get errored data exported. There may be some true errors in verbatim_date that are only going to get resolved in export if we parse them into the 6-field equivalent.
I'm happy to check date fields for consistency, but the issue here is a logical one. If you collected an insect in 1999 you could not have identified it in 1998.
@mjy, @tmcelrath , attached in a TSV are the 5249 date anomalies I found in which dateIdentified is earlier than the non-interval eventDate, or earlier than the "finish" date in an interval eventDate. For each record I give the "id" from the occurrence.txt table, the original eventDate, the tidied eventDate (see below), the original dateIdentified and the tidied dateIdentified (see below). I ignored the records with no year in eventDate and I haven't checked verbatimEventDate against eventDate.
A lot of these look like dateIdentified copy-down errors in a spreadsheet.
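For anyone wanting to reproduce the check described above, here is a minimal Python sketch of one way to flag a record. It is not the code behind the attached TSV; the function names are made up for illustration, and it assumes both values are ISO-style partial dates (YYYY, YYYY-MM or YYYY-MM-DD). For an interval eventDate it uses the "finish" date, and it compares only at the precision both dates share, so a year-only dateIdentified is never flagged against a collection date in the same year.

```python
def parse_partial(date_text):
    """Turn 'YYYY', 'YYYY-MM' or 'YYYY-MM-DD' into a tuple of ints."""
    return tuple(int(part) for part in date_text.strip().split("-"))


def identified_before_collected(event_date, date_identified):
    """Return True if dateIdentified is earlier than the (finish of) eventDate.

    Hypothetical helper: records with a missing value are not flagged.
    """
    if not event_date or not date_identified:
        return False
    # For an interval 'start/finish', use the finish date.
    finish = event_date.split("/")[-1]
    collected = parse_partial(finish)
    identified = parse_partial(date_identified)
    # Compare only the components present in both dates.
    shared = min(len(collected), len(identified))
    return identified[:shared] < collected[:shared]


if __name__ == "__main__":
    print(identified_before_collected("1999-06-23/1999-06-23", "1998"))
    print(identified_before_collected("1999-07-22", "1999"))
```

Comparing at shared precision is a deliberately conservative choice: it avoids false positives when one of the two dates is only a year.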
Tidying of eventDate (by example):
1875-02-06/1875-02-06 > 1875-02-06
1847-01-01/1847-12-31 > 1847
1997-01-01/1998-12-31 > 1997/1998
1877-06-01/1877-06-30 > 1877-06
1875-07-01/1875-07-31 > 1875-07
Tidying of dateIdentified (by example):
2023-5-8 > 2023-05-08
1941-1 > 1941-01
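The tidying rules in the examples above could be sketched in Python roughly as follows. This is only an illustration inferred from those examples, not the script actually used; the function names are hypothetical, and anything that does not match one of the example patterns is returned unchanged.

```python
import calendar
from datetime import date


def tidy_event_date(event_date):
    """Collapse a full-day interval to the shortest equivalent form."""
    if "/" not in event_date:
        return event_date
    start_text, end_text = event_date.split("/", 1)
    try:
        start = date.fromisoformat(start_text)
        end = date.fromisoformat(end_text)
    except ValueError:
        return event_date  # not a pair of YYYY-MM-DD dates: leave as is
    if start > end:
        return event_date  # reversed interval: leave for manual review
    if start == end:
        return start_text  # single day
    if (start.month, start.day) == (1, 1) and (end.month, end.day) == (12, 31):
        if start.year == end.year:
            return f"{start.year:04d}"  # one whole year
        return f"{start.year:04d}/{end.year:04d}"  # whole-year range
    last_day = calendar.monthrange(end.year, end.month)[1]
    same_month = (start.year, start.month) == (end.year, end.month)
    if same_month and start.day == 1 and end.day == last_day:
        return f"{start.year:04d}-{start.month:02d}"  # one whole month
    return event_date


def tidy_date_identified(date_identified):
    """Zero-pad month and day, e.g. '2023-5-8' becomes '2023-05-08'."""
    parts = date_identified.strip().split("-")
    if len(parts) > 3 or not all(part.isdigit() for part in parts):
        return date_identified  # unexpected shape: leave as is
    padded = [parts[0].zfill(4)] + [part.zfill(2) for part in parts[1:]]
    return "-".join(padded)


if __name__ == "__main__":
    print(tidy_event_date("1877-06-01/1877-06-30"))
    print(tidy_date_identified("2023-5-8"))
```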
Hey @Mesibov - as with the other issue, I need a .txt file with occurrenceID instead of id.
@tmcelrath, no problem, attached has id, occurrenceID and (if available) catalogNumber. Please note that many of these cases might be due to eventDate errors arising from the verbatimEventDate-to-eventDate problems
Some of these are flagged with "Determination is preceding collection date" - we need to be able to search by that in TW.
> ... we need to be able to search by that in TW.
And it needs to be a hard validation.
occurrenceID | eventDate | dateIdentified
223bdd7c-8994-4a00-87a1-1858347e63c5 | 1999-06-23/1999-06-23 | 1998
94b6a73a-7de4-451f-9d41-83338b74340e | 2000-08-07/2000-08-07 | 1998
5204d8f6-efd1-400e-be2b-ad31c9f0a869 | 2000-07-12/2000-07-12 | 1998
e0b936c2-74a0-4a8c-a9f1-2f45a2c8fbfb | 2000-07-30 | 1998
cc543268-4780-41e8-99c2-fcd69127c894 | 2000-07-17/2000-07-18 | 1999
48a19e8e-d1e6-4536-bedd-200185018972 | 1999-07-26/1999-07-28 | 1998
5aba7156-ca0f-43d8-97b3-85daae13d6d6 | 1999-05-17/1999-05-17 | 1997
88034de0-ee69-4570-b875-73d4c5537aef | 1999-09-08/1999-09-10 | 1998
bfa8ef3b-63f4-48d8-b733-065a628c1ae0 | 1997-07-29/1997-07-29 | 1993
e6059991-1ca5-4585-9761-3d3289bd3333 | 1998-05-13/1998-05-14 | 1997
2b134e80-6533-4989-b4c1-94f8239cecd2 | 1999-07-22 | 1998