AtlasOfLivingAustralia / data-management

Data management issue tracking
Data load: City of Gold Coast Koala Sightings update #1017

Closed timhicks-ala closed 2 weeks ago

timhicks-ala commented 5 months ago

Ticket: https://support.ehelp.edu.au/a/tickets/192972

DR: https://collections.ala.org.au/dataResource/show/dr16872

cha801p commented 5 months ago

Ticket Update: January 22, 2024 (6:30 PM)

Issue: City of Gold Coast Koala Sightings data update (dr16872)

Actions Taken:

Problems Encountered:

  1. Data load failed on Databox on the first run. While loading the data into our test system, an “override percentage check” flag was triggered, which indicates that over 50% of the UUIDs had changed compared to the previous load. On inspecting the catalog numbers, I observed that the format appeared consistent, ranging from 1 to 12914, so it initially seemed that the order of the occurrence records had changed while the catalog numbers stayed the same.
  2. Further investigation showed that catalogNumber in the newly loaded CSV was formatted as a float and therefore carried a decimal point, which caused the issue above.
  3. This issue was resolved and the following actions were taken:
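The ticket does not record how the float-formatted values were corrected. As an illustration only, a minimal Python sketch of one way to strip a trailing ".0" from catalogNumber before loading might look like the following. The function names are hypothetical, and the column name is taken from the comment above.

```python
import csv
import io
import re

# Matches whole numbers written as floats, e.g. "12914.0".
_FLOAT_INT = re.compile(r"^(\d+)\.0+$")


def normalize_catalog_number(value: str) -> str:
    """Return "12914" for "12914.0"; leave any other value unchanged."""
    match = _FLOAT_INT.match(value.strip())
    return match.group(1) if match else value


def normalize_csv(text: str, column: str = "catalogNumber") -> str:
    """Rewrite one column of a CSV document held in a string."""
    reader = csv.DictReader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row[column] = normalize_catalog_number(row[column])
        writer.writerow(row)
    return out.getvalue()
```

Values that are not whole numbers written as floats (for example "A1.5") are passed through untouched, so the sketch should not alter identifiers that were never numeric.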

Logs:
INFO [2024-01-22 06:27:50,941+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Writing metrics.....
INFO [2024-01-22 06:27:50,942+0000] [main] org.gbif.pipelines.common.beam.metrics.MetricsHandler: Trying to write pipeline's metadata to a file - /data/pipelines-data/dr16872/1/uuid-metrics.yml
INFO [2024-01-22 06:27:50,943+0000] [main] org.gbif.pipelines.common.beam.metrics.MetricsHandler: Added pipeline metadata - preservedUuidsAttempted: 9223, newUuidsAttempted: 3691

Expected Count: 12,914 records
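As a quick cross-check of the logged metrics against the expected count, the two counters can be pulled out of the log line and summed. This is a rough sketch: the helper name is hypothetical, and treating newUuidsAttempted / total as the share compared against the 50% threshold is an assumption, since the pipeline's actual check is not shown in this ticket.

```python
import re


def uuid_change_share(log_text):
    """Return (total records, share of new UUIDs) from a uuid-metrics log line.

    Assumption: the share of changed UUIDs is new / (preserved + new).
    """
    preserved = int(re.search(r"preservedUuidsAttempted:\s*(\d+)", log_text).group(1))
    new = int(re.search(r"newUuidsAttempted:\s*(\d+)", log_text).group(1))
    total = preserved + new
    return total, new / total


line = "Added pipeline metadata - preservedUuidsAttempted: 9223, newUuidsAttempted: 3691"
total, share = uuid_change_share(line)
print(total, f"{share:.1%}")
```

With the figures logged above, the total is 9223 + 3691 = 12914, which matches the expected count.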

cha801p commented 5 months ago

Ticket Update: January 22, 2024 (9 AM). Data loaded successfully.

cha801p commented 5 months ago

The data provider identified an issue with the dates. Fixed the date format and reloaded the data. This was communicated to the data provider on 30 Jan 2024 at 2:14 PM. Data reloaded successfully.