AtlasOfLivingAustralia / data-management

Data management issue tracking

NVA date update #1111

Open sadeghim opened 1 month ago

sadeghim commented 1 month ago

Some NVA eventDate values have a trailing "Z", which needs to be removed so that pipelines can parse the dates.

cha801p commented 1 month ago

Ticket Update: September 26 2024

Issue: Fix data format for the Tasmanian Natural Values Atlas (NVA) dr710.

Solution: Successfully removed the "Z" from date entries (e.g., changed "01-06-2017Z" to "01-06-2017").
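For anyone repeating this fix, here is a minimal sketch of one way the trailing "Z" could be stripped. It is not the script actually used for dr710. It assumes the occurrence data is a comma-separated file with a header row containing an `eventDate` column. The helper names `strip_trailing_z` and `clean_event_dates` are hypothetical, and the `occurrenceID` column in the usage note is illustrative. The pattern only touches date-only values such as "01-06-2017Z" and leaves full timestamps (where "Z" is a meaningful UTC marker) unchanged.

```python
import csv
import io
import re

# Date-only value followed by a stray "Z", e.g. "01-06-2017Z".
# Timestamps such as "2017-06-01T10:00:00Z" do not match and are left alone.
DATE_ONLY_WITH_Z = re.compile(r"^\s*(\d{1,4}-\d{1,2}-\d{1,4})Z\s*$")


def strip_trailing_z(value):
    """Return the date without its trailing 'Z'; other values pass through unchanged."""
    match = DATE_ONLY_WITH_Z.match(value)
    return match.group(1) if match else value


def clean_event_dates(src, dst, column="eventDate"):
    """Copy CSV rows from src to dst, cleaning one column; return the number of rows changed."""
    reader = csv.DictReader(src)
    writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
    writer.writeheader()
    changed = 0
    for row in reader:
        cleaned = strip_trailing_z(row[column])
        if cleaned != row[column]:
            changed += 1
        row[column] = cleaned
        writer.writerow(row)
    return changed
```

Example usage with in-memory data: `clean_event_dates(io.StringIO("occurrenceID,eventDate\n1,01-06-2017Z\n"), io.StringIO())` reports one changed row. If the archive is tab-delimited, `delimiter="\t"` would need to be passed to both the reader and the writer.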

Actions Taken:

Error Log:
INFO [2024-09-25 08:00:37,649+0000] [main] au.org.ala.pipelines.util.VersionInfo: git.remote.origin.url=https://github.com/gbif/pipelines
INFO [2024-09-25 08:00:38,776+0000] [main] au.org.ala.pipelines.beam.ALADwcaToVerbatimPipeline: Adding step 1: Options
INFO [2024-09-25 08:00:38,776+0000] [main] au.org.ala.pipelines.beam.ALADwcaToVerbatimPipeline: Non-HDFS Input path: /data/biocache-load/dr710
25-Sep 08:00:38 [LA-PIPELINES] [dr710] [ERROR] Unexpected error during DWCA-AVRO conversion dr710 step
25-Sep 08:00:38 [LA-PIPELINES] [dr710] [ERROR] Error 1 occurred on 1

Issues Encountered:

Successfully loaded the data onto Databox and production environments.

Loaded Data for Review:
Test: Collections Test - DR710
Production: Collections Production - DR710

cha801p commented 1 month ago

Prod UUID count logs:
24/09/26 08:20:48 INFO SparkContext: Successfully stopped SparkContext
24/09/26 08:20:48 INFO ALAUUIDMintingPipeline: Checking the percentage change in new UUIDs:
24/09/26 08:20:48 INFO ALAUUIDMintingPipeline: newUuids: 0.0, preservedUuids: 1121933.0, orphanedUniqueKeys: 0.0
24/09/26 08:20:48 INFO ALAUUIDMintingPipeline: Percentage UUID change: 0, allowed percentage: 50, override percentage check: false