danstoner closed this 1 year ago
The bounds were fixed, but the data flag was not. A fix is still needed.
Well, this is a unit test that checks whether datecollected references the flag 'datecollected_bounds'. When setFlags() is run here it simply returns ['datecollected_bounds'], which is the assertion being made, but the test doesn't actually check whether that date is within the real bounds. You can set the date to datetime.date(1,1,2) in this unit test and setFlags() will still return ['datecollected_bounds'], so the test passes. The flag itself is set by the checkBounds function in conversions: https://github.com/iDigBio/idb-backend/blob/4c5a66886a2d28cbe90748f980d319c0372547ab/idb/helpers/conversions.py#L239
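As a rough sketch of what a bounds-aware test could look like: the names below (MIN_DATE, date_out_of_bounds, flags_for) are hypothetical stand-ins, not the actual idb-backend functions, and the 1500-01-01 lower bound is only an assumption based on this thread. The point is to assert on dates on both sides of each boundary rather than only checking that the flag name comes back.

```python
import datetime

# Hypothetical lower bound; this thread discusses moving it from 1700 back to the 1500s.
MIN_DATE = datetime.date(1500, 1, 1)


def date_out_of_bounds(collected, indexed_on, min_date=MIN_DATE):
    """Return True when `collected` falls outside [min_date, indexed_on]."""
    return collected < min_date or collected > indexed_on


def flags_for(collected, indexed_on):
    """Hypothetical stand-in for flag setting: emit the flag only when out of bounds."""
    if date_out_of_bounds(collected, indexed_on):
        return ["datecollected_bounds"]
    return []


def test_datecollected_bounds():
    indexed_on = datetime.date(2024, 1, 1)
    # Far too early: should be flagged.
    assert flags_for(datetime.date(1, 1, 2), indexed_on) == ["datecollected_bounds"]
    # Within bounds (a 1600s record): should not be flagged.
    assert flags_for(datetime.date(1650, 6, 15), indexed_on) == []
    # Exactly on the lower bound: should not be flagged.
    assert flags_for(MIN_DATE, indexed_on) == []
    # After the indexing date: should be flagged.
    assert flags_for(datetime.date(2024, 1, 2), indexed_on) == ["datecollected_bounds"]
```

A test shaped like this would fail if the bounds logic regressed, which the current assertion cannot detect.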
So do the flags only persist on specimens that haven't been re-ingested since the change?
For example: https://portal.idigbio.org/portal/records/333b5fd2-2d0f-43a7-9111-8274e12e6d11
There are 276,015 more that have this flag but occur between 1500-01-01 and 2024-01-01.
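For illustration, the kind of check behind that count could be sketched as below. The record shape (dicts with "uuid", "datecollected", and "flags" keys) and the helper name are assumptions for the example, not the real index schema or any existing idb-backend function.

```python
import datetime


def stale_bounds_flags(records, min_date, max_date, flag="datecollected_bounds"):
    """Yield the uuid of each record that carries `flag` although its date is in bounds.

    `records` is assumed to be an iterable of dicts with hypothetical keys
    "uuid", "datecollected" (a datetime.date or None), and "flags" (a list).
    """
    for rec in records:
        collected = rec.get("datecollected")
        if collected is None:
            continue  # no date to compare against
        if flag in rec.get("flags", []) and min_date <= collected <= max_date:
            yield rec["uuid"]


# Made-up sample data, only to show the call.
sample = [
    {"uuid": "a", "datecollected": datetime.date(1650, 1, 1), "flags": ["datecollected_bounds"]},
    {"uuid": "b", "datecollected": datetime.date(1400, 1, 1), "flags": ["datecollected_bounds"]},
    {"uuid": "c", "datecollected": datetime.date(1900, 1, 1), "flags": []},
]
stale = list(stale_bounds_flags(sample, datetime.date(1500, 1, 1), datetime.date(2024, 1, 1)))
```

Here only record "a" is reported: it is flagged but falls inside the 1500-01-01 to 2024-01-01 window.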
I see what you are getting at. For a lot of questions about data quality versus the code, the answer is going to be "it depends". It depends on factors such as whether a full index has been built since this record was ingested, whether the recordset was paused, and whether other changes to the system could account for it. The example you gave still says: "Date Collected out of bounds (Not between 1700-01-02 and the date of Indexing). Date Collected is generally composed from dwc:year, dwc:month, dwc:day or as specified in dwc:eventDate." That tells me the actual pages may need to be updated for this as well. I will be sure to discuss this with team members who have been with the project longer than I have.
As reported in https://github.com/iDigBio/idb-backend/issues/229 the oldest natural history records actually date back to the 1500s rather than 1700.
I have confirmed there are some digitized records in GBIF from the 1600s.
Moving our zero date back a few hundred years.