AtlasOfLivingAustralia / data-management

Data management issue tracking

dr15584 - Dung Beetle #863

Closed rosemaryjoconnor closed 4 months ago

rosemaryjoconnor commented 1 year ago

New data for Dung Beetle is to be sent soon. Russell Barrow emailed us in January, but it slipped off the radar. I have looked at his sample dataset for the new load and it is in good shape. The main issue is that they have indicated catalogNumbers are going to change. I've emailed them to clarify whether the existing records will also have changed catalogNumbers and, if so, whether they can provide a CSV with an old->new mapping.

Once we have clarification, the upload should be straightforward.

rosemaryjoconnor commented 1 year ago

New code written to extract to CSV. Found that 486 catalogNumbers are duplicated; some appear on 3 or more records. Have sent a list of these to the Dung Beetle team.
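For reference, a duplicate check like the one described above could look roughly like this. It is a minimal sketch, not the extraction code actually used: the function name `find_duplicate_catalog_numbers`, the `catalogNumber` column header, and the sample rows are all assumptions for illustration.

```python
import csv
import io
from collections import Counter


def find_duplicate_catalog_numbers(csv_file, column="catalogNumber"):
    """Return {catalogNumber: record count} for values that occur more than once.

    `column` is an assumed header name; adjust it to match the real extract.
    """
    reader = csv.DictReader(csv_file)
    counts = Counter(row[column].strip() for row in reader)
    return {value: n for value, n in counts.items() if n > 1}


# Made-up sample rows, only to show the shape of the input.
sample = io.StringIO(
    "catalogNumber,recordNote\n"
    "DB-001,first\n"
    "DB-002,second\n"
    "DB-001,third\n"
    "DB-001,fourth\n"
)

duplicates = find_duplicate_catalog_numbers(sample)
for catalog_number, n in sorted(duplicates.items()):
    print(f"{catalog_number}: {n} records")
```

Counting with a `Counter` keeps the record count per value, so the list sent to the data provider can show which catalogNumbers have 3 or more records.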

rosemaryjoconnor commented 1 year ago

Russell Barrow has responded. The dataset provided to date needs to be added to the existing ALA data. It is a whole new set of data with catalogNumbers derived differently from the existing data, but there should be no clashes. When a new full dataset is provided, the existing ALA records in that dataset should not have new catalogNumbers, so no mapping is required. I'll get this into databox and let Russell review.
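The "no clashes" expectation could be verified before loading with something like the sketch below. It assumes both the existing ALA records and the new dataset are available as CSV with a `catalogNumber` column; the helper names `read_catalog_numbers` and `find_clashes` and the sample values are hypothetical.

```python
import csv
import io


def read_catalog_numbers(csv_file, column="catalogNumber"):
    """Collect the set of catalogNumbers from a CSV file object (assumed header name)."""
    return {row[column].strip() for row in csv.DictReader(csv_file)}


def find_clashes(existing_file, new_file, column="catalogNumber"):
    """Return the sorted catalogNumbers that appear in both datasets."""
    existing = read_catalog_numbers(existing_file, column)
    new = read_catalog_numbers(new_file, column)
    return sorted(existing & new)


# Made-up values, only to illustrate the check.
existing_csv = io.StringIO("catalogNumber\nOLD-1\nOLD-2\n")
new_csv = io.StringIO("catalogNumber\nNEW-1\nOLD-2\n")

clashes = find_clashes(existing_csv, new_csv)
print(clashes if clashes else "No clashing catalogNumbers")
```

An empty result would support adding the new records alongside the existing data without an old->new mapping.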

rosemaryjoconnor commented 1 year ago

Jenkins preingest load dataset ran successfully. Will check tomorrow after the index run.

rosemaryjoconnor commented 1 year ago

New records successfully loaded in production.

rosemaryjoconnor commented 1 year ago

Only the 789 new records were entered. There are a further 12K that need to be done.

rosemaryjoconnor commented 1 year ago

All 13480 records loaded successfully in databox and in Production. Just waiting for the index to run overnight.