Closed (timhicks-ala closed 2 months ago)
Ticket Update: March 26, 2024 (5 PM)
Issue: Data Refresh
Solution: Successfully load the new dataset into biocache
Actions Taken: Successfully loaded the data on test
Loaded data for review: Metadata: data
Logs:
INFO [2024-03-21 06:30:35,144+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Running the pipeline
INFO [2024-03-21 06:30:36,071+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Checking the percentage change in new UUIDs:
INFO [2024-03-21 06:30:36,073+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: newUuids: 686.0, preservedUuids: 2409.0, orphanedUniqueKeys: 0.0
INFO [2024-03-21 06:30:36,073+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Percentage UUID change: 22, allowed percentage: 50, override percentage check: false
INFO [2024-03-21 06:30:36,073+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Backing up existing UUIDs to /data/pipelines-data/dr2592/1/identifiers/ala_uuid_backup_1711002636073
INFO [2024-03-21 06:30:36,073+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Pipeline complete.
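For anyone checking the numbers in the log above: the reported "Percentage UUID change: 22" is consistent with newUuids divided by (newUuids + preservedUuids), truncated to a whole percent. The sketch below reproduces that arithmetic. The formula, the function names, and the "less than or equal" comparison are assumptions inferred from the logged values, not taken from the ALAUUIDMintingPipeline source.

```python
def uuid_change_percentage(new_uuids: float, preserved_uuids: float) -> int:
    """Share of new UUIDs among all records, truncated to a whole percent.

    Assumed formula: it matches the logged value but is not confirmed
    against the pipeline code.
    """
    total = new_uuids + preserved_uuids
    if total == 0:
        return 0
    return int(new_uuids * 100 / total)


def passes_check(percentage: int, allowed: int, override: bool = False) -> bool:
    """Hypothetical gate: pass if overridden or within the allowed percentage."""
    return override or percentage <= allowed


# Values copied from the log: newUuids 686.0, preservedUuids 2409.0, allowed 50
pct = uuid_change_percentage(686.0, 2409.0)
print(pct, passes_check(pct, allowed=50))
```

With 686 new and 2,409 preserved UUIDs this gives 22, which is under the allowed 50, so the load proceeds without the override flag.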
Status: Waiting for confirmation from the data provider
Ticket Update: April 3, 2024 (5 PM)
Issue: Data Refresh
Solution: Successfully load the new dataset into biocache
Actions Taken: Successfully loaded the data on biocache
- Data review
- Columns renamed: occurrenceID to catalogNumber
- DwcA created locally
- Loaded the data on collectory
- Ingest_small_dataset kept failing
- Reingested the data
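The column rename in the steps above can be sketched as a header rewrite on the occurrence file before the archive is built. This is only an illustration: the helper name `rename_column` and the sample row are made up, and the real file layout and delimiter may differ.

```python
import csv
import io


def rename_column(csv_text: str, old: str, new: str) -> str:
    """Return csv_text with the header column `old` renamed to `new`."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return csv_text
    header = rows[0]
    if old not in header:
        raise KeyError(f"column {old!r} not found in header")
    if new in header:
        raise ValueError(f"column {new!r} already exists in header")
    # Only the header row changes; data rows are written back untouched.
    rows[0] = [new if name == old else name for name in header]
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()


# Made-up sample row, for illustration only
sample = "occurrenceID,scientificName\nA-001,Acacia dealbata\n"
print(rename_column(sample, "occurrenceID", "catalogNumber"))
```

Refusing to rename when the target column already exists avoids silently producing two catalogNumber columns.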
Problems encountered: Incremental load was set to True, but this was not reflected on collectory. This created orphaned records, because dwca-imports was replaced by the new dwca (which contained only the new records). The old dwca was replaced on s3 and the preingestion was re-run.
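To illustrate why replacing the archive with a new-records-only file orphans the existing records, here is a minimal sketch that compares unique keys between what is already loaded and what is being ingested. The function `compare_keys` and the synthetic keys are hypothetical. The counts are chosen to mirror the log figures (2,409 existing, 686 new), and the category names borrow from the log labels without claiming the pipeline computes them this way.

```python
def compare_keys(existing_keys, incoming_keys):
    """Count new, preserved, and orphaned unique keys between two loads."""
    existing, incoming = set(existing_keys), set(incoming_keys)
    return {
        "new": len(incoming - existing),        # only in the incoming archive
        "preserved": len(incoming & existing),  # present in both
        "orphaned": len(existing - incoming),   # loaded before, missing now
    }


# Synthetic keys, for illustration only
existing = {f"K{i}" for i in range(2409)}
additions = {f"K{i}" for i in range(2409, 3095)}

# Archive holding only the new records: every existing key is orphaned
print(compare_keys(existing, additions))

# Archive holding old plus new records: nothing is orphaned
print(compare_keys(existing, existing | additions))
```

The second case corresponds to the successful run, where the log reports orphanedUniqueKeys: 0.0.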
Loaded data for review: Metadata: data
Logs:
INFO [2024-04-03 03:27:49,897+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Create ALAUUIDRecords and write out to AVRO
INFO [2024-04-03 03:27:49,926+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Running the pipeline
INFO [2024-04-03 03:27:50,808+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Checking the percentage change in new UUIDs:
INFO [2024-04-03 03:27:50,809+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: newUuids: 686.0, preservedUuids: 2409.0, orphanedUniqueKeys: 0.0
INFO [2024-04-03 03:27:50,810+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Percentage UUID change: 22, allowed percentage: 50, override percentage check: false
INFO [2024-04-03 03:27:50,810+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Backing up existing UUIDs to /data/pipelines-data/dr2592/1/identifiers/ala_uuid_backup_1712114870810
INFO [2024-04-03 03:27:50,810+0000] [main] au.org.ala.pipelines.beam.ALAUUIDMintingPipeline: Pipeline complete.
INFO [2024-04-03 03:27:50,810+0000
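When matching a UUID backup directory to a particular run, note that the numeric suffix on the backup path in the log above appears to be a Unix timestamp in milliseconds. Decoding it gives the same UTC time as the log line that wrote it. The sketch below does that decoding; treating the suffix as epoch milliseconds is an inference from the logged values, and `backup_time` is a hypothetical helper.

```python
import re
from datetime import datetime, timedelta, timezone


def backup_time(path: str) -> datetime:
    """Decode the trailing digits of a backup path as epoch milliseconds (UTC)."""
    match = re.search(r"(\d+)$", path)
    if match is None:
        raise ValueError(f"no numeric suffix in {path!r}")
    # Split into whole seconds and leftover milliseconds to avoid float rounding.
    seconds, millis = divmod(int(match.group(1)), 1000)
    return datetime.fromtimestamp(seconds, tz=timezone.utc) + timedelta(milliseconds=millis)


# Path copied from the log line dated 2024-04-03 03:27:50,810+0000
path = "/data/pipelines-data/dr2592/1/identifiers/ala_uuid_backup_1712114870810"
print(backup_time(path).isoformat(timespec="milliseconds"))
```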
Status: Waiting for indexing
Status: Review links sent to the data provider
Went to prod last week. Is this good to close @cha801p?
@peggynewman Review links have already been sent to the data provider; we are just waiting for confirmation. I will send a follow-up email today.
From https://support.ehelp.edu.au/a/tickets/197899
Existing data resource (2,409 records): https://collections.ala.org.au/public/show/dr2592
The newly supplied file has ~680 records, so likely it is an additional update rather than the whole dataset.
A new metadata file has also been provided.