AtlasOfLivingAustralia / data-management

Data management issue tracking

Merge Data Load: APC data for December 2023 to February 2024 #1038

Closed (cha801p closed this issue 2 weeks ago)

cha801p commented 3 months ago

Ticket Update: March 6, 2024 (2 PM)

Issue: Refresh of APC data for December 2023 to February 2024

Resolution: Load the dataset onto the databox

Actions Taken:

Prod Log Records:

INFO [2024-03-06 03:15:02,860+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Checking the percentage change in new UUIDs:
INFO [2024-03-06 03:15:02,861+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: newUuids: 149.0, preservedUuids: 7252.0, orphanedUniqueKeys: 108.0
INFO [2024-03-06 03:15:02,861+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Percentage UUID change: 2, allowed percentage: 50, override percentage check: false
INFO [2024-03-06 03:15:02,861+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Backing up existing UUIDs to /data/pipelines-data/dr8128/1/identifiers/ala_uuid_backup_1709694902861
INFO [2024-03-06 03:15:02,862+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Pipeline complete.
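For anyone re-checking the UUID numbers by hand, here is a rough Python sketch of the percentage check reported in the log. The formula is an assumption: it takes new UUIDs as a whole-number percentage of preserved UUIDs, which reproduces the logged value of 2, but it has not been confirmed against the ALAUUIDMintingPipeline source. The function names (parse_counts, percentage_change, check_passes) are hypothetical helpers, not part of the pipeline.

```python
import re

# Counts copied from the prod log line above.
LOG_LINE = "newUuids: 149.0, preservedUuids: 7252.0, orphanedUniqueKeys: 108.0"


def parse_counts(line):
    """Extract the 'name: value' pairs from a UUID-minting log line."""
    return {name: float(value) for name, value in re.findall(r"(\w+): ([\d.]+)", line)}


def percentage_change(new_uuids, preserved_uuids):
    """ASSUMED formula: new UUIDs as a truncated percentage of preserved UUIDs."""
    if preserved_uuids == 0:
        return 0
    return int(new_uuids * 100 / preserved_uuids)


def check_passes(pct, allowed=50, override=False):
    """The load proceeds if the change is within the allowed percentage or the check is overridden."""
    return override or pct <= allowed


counts = parse_counts(LOG_LINE)
pct = percentage_change(counts["newUuids"], counts["preservedUuids"])
print(pct, check_passes(pct))
```

With the logged counts this gives a change of 2 against an allowed 50, so the check passes without the override, consistent with the "Pipeline complete." line.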

Links:

Challenges Encountered:

Useful Prod Stats before indexing (for testing purposes):

7,206 records returned of 7,252 for Data resource: Australian Platypus Conservancy
Exclude duplicate records: 33 records excluded
Exclude records with unresolved user annotations: 13 records excluded
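A quick sanity check on these stats, as a minimal Python sketch. It assumes the two filters exclude disjoint sets of records (no record is counted under both), which the numbers are consistent with but which is not stated in the ticket. The helper name expected_returned is hypothetical.

```python
def expected_returned(total, exclusions):
    """Records left after subtracting every exclusion count from the total."""
    return total - sum(exclusions.values())


# Figures copied from the prod stats above.
total_records = 7252
exclusions = {
    "duplicate records": 33,
    "unresolved user annotations": 13,
}

remaining = expected_returned(total_records, exclusions)
print(remaining)
```

The result matches the 7,206 records returned, so the two exclusion filters account for the full difference from the 7,252 preserved UUIDs.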

Status: Waiting for indexing