AtlasOfLivingAustralia / data-management

Data management issue tracking

ReInterpretation of BASE data dr10487 #994

Closed peggynewman closed 6 months ago

peggynewman commented 7 months ago

The BASE dataset, dr10487, is an eDNA dataset, but the Environmental DNA contentType field is somehow missing from the data. I've now added it in the collectory. Please rerun interpretation and indexing for dr10487 so that it picks up the collectory change and creates a content type field in the data. The records will then be excluded correctly by the data quality profiles.

cha801p commented 6 months ago

Ticket Update: December 11, 2023 (3:30 PM)

Problem: ReInterpretation of BASE data dr10487

Solution: Successful ReInterpretation of dr10487

Actions Taken: Data was reinterpreted (Ran Ingest_large_datasets DAG with "skip_dwca_to_verbatim": "true")

Logs:

Collectory Change: Content types: Environmental DNA

Record count: 1,024,190 records (https://biocache.ala.org.au/occurrences/search?q=data_resource_uid:dr10487)

UUID Log: 23/12/11 04:50:15 INFO ALAUUIDMintingPipeline: newUuids: 0.0, preservedUuids: 1024190.0, orphanedUniqueKeys: 0.0
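
For anyone repeating this check, the UUID log line can be compared against the record count programmatically. This is only a minimal sketch: the function names `parse_uuid_counts` and `uuids_fully_preserved` are hypothetical helpers, not part of the pipelines code, and it assumes the log line keeps the `name: value` layout shown above.

```python
import re

# Log line copied from the ticket update above.
LOG_LINE = (
    "23/12/11 04:50:15 INFO ALAUUIDMintingPipeline: newUuids: 0.0, "
    "preservedUuids: 1024190.0, orphanedUniqueKeys: 0.0"
)


def parse_uuid_counts(line):
    """Hypothetical helper: extract the name -> count pairs that follow the pipeline name."""
    # Only look at the text after the pipeline name so the timestamp is ignored.
    payload = line.split("ALAUUIDMintingPipeline:", 1)[1]
    pairs = re.findall(r"(\w+):\s*([0-9]+(?:\.[0-9]+)?)", payload)
    # The counts are logged as floats (e.g. 1024190.0); convert them to ints.
    return {name: int(float(value)) for name, value in pairs}


def uuids_fully_preserved(counts, expected_records):
    """Hypothetical check: no new or orphaned keys, and every record kept its UUID."""
    return (
        counts["newUuids"] == 0
        and counts["orphanedUniqueKeys"] == 0
        and counts["preservedUuids"] == expected_records
    )


counts = parse_uuid_counts(LOG_LINE)
print(counts, uuids_fully_preserved(counts, 1024190))
```

With the values logged here, the preserved UUID count matches the 1,024,190 records reported for dr10487, which is what a reinterpretation that leaves identifiers untouched should show.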

Verify:

cha801p commented 6 months ago

Ticket Update: December 14, 2023 (3 PM)

Problem: ReInterpretation of BASE data dr10487

Solution: Successful ReInterpretation of dr10487

Actions Taken:

Possible Solution:

Status: Work Pending

peggynewman commented 6 months ago

This might be a problem with the DQ filters. I've created a ticket on the systems team board. https://github.com/orgs/AtlasOfLivingAustralia/projects/15/views/15?pane=issue&itemId=47637256

I'm going to close this now. The contentTypes are there in SOLR.
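
One way to spot-check that the field is populated is to filter the dataset's records on it. The following is a rough sketch only: it assumes the indexed field is named `contentTypes` as written above, that the biocache search page accepts an `fq` filter alongside `q`, and `content_type_check_url` is a hypothetical helper name.

```python
from urllib.parse import urlencode

# Search page taken from the record-count link earlier in this thread.
BIOCACHE_SEARCH = "https://biocache.ala.org.au/occurrences/search"


def content_type_check_url(data_resource_uid, content_type_field="contentTypes"):
    """Hypothetical helper: build a search URL for records where the field has any value."""
    params = [
        ("q", f"data_resource_uid:{data_resource_uid}"),
        # Assumption: a wildcard filter query matches records with the field set.
        ("fq", f"{content_type_field}:*"),
    ]
    return f"{BIOCACHE_SEARCH}?{urlencode(params)}"


print(content_type_check_url("dr10487"))
```

If the field is present on every record, the count returned for this URL should match the total record count for dr10487.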