AtlasOfLivingAustralia / data-management

Data management issue tracking

Data load: Top End caddisfly records #1023

Closed by timhicks-ala 2 weeks ago

timhicks-ala commented 5 months ago

From https://support.ehelp.edu.au/a/tickets/194200 - completed data and metadata templates are in the helpdesk ticket.

URL for dataset: https://depws.nt.gov.au/

cha801p commented 5 months ago

https://collections-test.ala.org.au/dataResource/show/dr22364

cha801p commented 4 months ago

Ticket Update: February 7, 2024 (11:30 PM)

Issue: New events dataset to load.

Solution: Load the new dataset into biocache and events.

Actions Taken: Successfully loaded the data on the test environment

Issues Encountered: Date structure discrepancies were found, caused by unformatted dates and an extra 'year' column. These were fixed and the data was reloaded. Separately, the initially created dr disappeared from the dr list, so a new dr was created and its metadata was filled in again.
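For anyone hitting the same date problem, here is a rough sketch of the kind of cleanup described above: normalise the date column to ISO 8601 and drop the extra 'year' column before reloading. This is not the script that was actually run. The field name `eventDate`, the list of input formats, and the helper names `to_iso_date` and `clean_rows` are all assumptions for illustration; the ticket does not say which formats the source file used.

```python
from datetime import datetime

# Assumed input formats; the ticket does not record the real ones.
CANDIDATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")


def to_iso_date(raw):
    """Return the date as YYYY-MM-DD, or the stripped input if no format matches."""
    text = raw.strip()
    for fmt in CANDIDATE_FORMATS:
        try:
            return datetime.strptime(text, fmt).date().isoformat()
        except ValueError:
            continue
    # Leave unrecognised values untouched so they can be reviewed by hand.
    return text


def clean_rows(rows, date_field="eventDate", drop_field="year"):
    """Copy each row, drop the extra column, and normalise the date field."""
    cleaned = []
    for row in rows:
        row = dict(row)  # do not mutate the caller's data
        row.pop(drop_field, None)
        if date_field in row:
            row[date_field] = to_iso_date(row[date_field])
        cleaned.append(row)
    return cleaned
```

Values that match none of the assumed formats are passed through unchanged rather than guessed at, so they still show up for manual review after the load.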

Loaded data for review: https://collections-test.ala.org.au/public/show/dr22364

Status: Waiting for the data provider

cha801p commented 4 months ago

Actions Taken: Follow-up email sent to the data provider

Status: Waiting for the data provider

cha801p commented 4 months ago

Ticket Update: March 5, 2024 (11:30 PM)

Issue: New dataset to load.

Solution: Load the new dataset into biocache.

Actions Taken:

Logs:
INFO [2024-03-05 00:36:09,697+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Pipeline complete.
INFO [2024-03-05 00:36:09,697+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Checking for backups to prune..../data/pipelines-data/dr25184/1/identifiers
INFO [2024-03-05 00:36:09,697+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Writing metrics.....
INFO [2024-03-05 00:36:09,699+0000] [main] org.gbif.pipelines.common.beam.metrics.MetricsHandler: Trying to write pipeline's metadata to a file - /data/pipelines-data/dr25184/1/uuid-metrics.yml
INFO [2024-03-05 00:36:09,700+0000] [main] org.gbif.pipelines.common.beam.metrics.MetricsHandler: Added pipeline metadata - newUuidsAttempted: 2599,

Loaded data for review: https://collections.ala.org.au/dataResource/show/dr25184

Status: Waiting for indexing

cha801p commented 4 months ago

https://collections.ala.org.au/dataResource/show/dr25184

cha801p commented 4 months ago

Ticket Update: March 6, 2024 (3 PM)

Issue: A problem with the individual count was identified.

Solution: Reload the dataset into biocache.

Actions Taken:

Logs:
INFO [2024-03-06 03:43:02,942+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Running the pipeline
INFO [2024-03-06 03:43:03,821+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Checking the percentage change in new UUIDs:
INFO [2024-03-06 03:43:03,822+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: newUuids: 0.0, preservedUuids: 2599.0, orphanedUniqueKeys: 0.0
INFO [2024-03-06 03:43:03,822+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Percentage UUID change: 0, allowed percentage: 50, override percentage check: false
INFO [2024-03-06 03:43:03,822+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Backing up existing UUIDs to /data/pipelines-data/dr25184/1/identifiers/ala_uuid_backup_1709696583822
INFO [2024-03-06 03:43:03,823+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Pipeline complete.
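To sanity-check a reload from these logs without reading them by eye, the UUID counts can be pulled out of the log line and compared against the allowed percentage. The sketch below is only an illustration: the helper names are made up, and the formula (new UUIDs as a share of new plus preserved UUIDs) is an assumption, not taken from the ALAUUIDMintingPipeline source. The default of 50 mirrors the "allowed percentage: 50" reported in the log.

```python
import re

# Matches the three counts printed on the UUID summary log line.
METRIC_PATTERN = re.compile(
    r"(newUuids|preservedUuids|orphanedUniqueKeys):\s*(\d+(?:\.\d+)?)"
)


def parse_uuid_counts(log_line):
    """Extract the UUID counts from a log line into a dict of floats."""
    return {name: float(value) for name, value in METRIC_PATTERN.findall(log_line)}


def percentage_change(counts):
    """Assumed formula: new UUIDs as a percentage of all UUIDs in this load."""
    total = counts["newUuids"] + counts["preservedUuids"]
    if total == 0:
        return 0.0
    return 100.0 * counts["newUuids"] / total


def within_allowed_change(counts, allowed_percentage=50.0):
    """True when the assumed percentage change does not exceed the threshold."""
    return percentage_change(counts) <= allowed_percentage
```

With the counts logged for this reload (no new UUIDs, 2599 preserved), the assumed formula gives a change of 0, which agrees with the "Percentage UUID change: 0" line and suggests the reload kept the existing record identifiers.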

Loaded data for review: https://collections.ala.org.au/dataResource/show/dr25184

Status: Waiting for indexing

cha801p commented 3 months ago

Links sent to the data provider.