OregonDigital / oregondigital

OregonDigital Hydra Application
https://oregondigital.org/catalog/

Add Identifier values for Sheet Music items without Identifier #1408

Closed wickr closed 3 years ago

wickr commented 3 years ago

About 90% of Sheet Music items don't have an Identifier value, which is now a required field in OD2.

The identifier can be made from the filename of the original upload, without the file extension.
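As a rough illustration of the idea, deriving an identifier from an upload filename could look like the sketch below. The function name and the sample filenames are hypothetical, and it assumes the original filename is actually available for each item.

```python
from pathlib import PurePosixPath


def identifier_from_filename(filename: str) -> str:
    """Return the file name without its directory or final extension."""
    # .stem strips only the last suffix, so "score.final.pdf" -> "score.final".
    return PurePosixPath(filename).stem


if __name__ == "__main__":
    # Hypothetical filename, for illustration only.
    print(identifier_from_filename("uploads/sheetmusic_0001.tif"))
```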

wickr commented 3 years ago

I was wrong: the Sheet Music items didn't retain the original filename (anywhere in Fedora that I can see), so the content datastream label is just 'File Datastream' for all of them.

Creating identifiers manually is still an option, either one-by-one or with a spreadsheet.

Or it's probably possible to derive identifiers from the Titles, for example by changing spaces to underscores and removing punctuation.
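A minimal sketch of that Title-based approach, assuming only ASCII punctuation needs to be stripped. The function name and sample titles are hypothetical, not taken from the collection.

```python
import string


def title_to_identifier(title: str) -> str:
    """Remove ASCII punctuation, then join the remaining words with underscores."""
    # str.translate with a deletion table drops every character in string.punctuation.
    no_punct = title.translate(str.maketrans("", "", string.punctuation))
    # split() with no arguments collapses runs of whitespace and trims the ends.
    return "_".join(no_punct.split())


if __name__ == "__main__":
    # Hypothetical title, for illustration only.
    print(title_to_identifier("Oh! Susanna (Waltz)"))
```

Two different Titles can collapse to the same string this way, so the results would still need a uniqueness check before being used as identifiers.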

sseymore commented 3 years ago

Dark archives dump: https://drive.google.com/file/d/1cwOBK23zrj6Z2zxOQcpccf4Kw6nv_VJo/view

Former dump to use to remediate: https://docs.google.com/spreadsheets/d/1Fp0f23-hXLuuMrLSD5k6gbuneXR7PZ0pXiW2DpI0w8k/edit#gid=1877374817

wickr commented 3 years ago

First pass with the matching CSV is complete; 1094 items were changed.

A quick Solr check found 34 items still without identifiers.
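For reference, a check like this can be expressed as a Solr query that excludes every document with any value in the identifier field. The sketch below only builds the request URL. The Solr base URL, the field names `identifier_ssim` and `collection_ssim`, and the collection value are all assumptions for illustration, not the actual OregonDigital schema.

```python
from urllib.parse import urlencode

# Hypothetical values; the real Solr endpoint and field names are not given in this thread.
SOLR_SELECT_URL = "http://localhost:8983/solr/core/select"
IDENTIFIER_FIELD = "identifier_ssim"
COLLECTION_FIELD = "collection_ssim"


def missing_identifier_params(collection: str) -> dict:
    """Build Solr parameters matching items in a collection with no identifier value."""
    return {
        # "*:* -field:[* TO *]" matches documents where the field has no value at all.
        "q": f"*:* -{IDENTIFIER_FIELD}:[* TO *]",
        "fq": f'{COLLECTION_FIELD}:"{collection}"',
        "rows": 0,  # only the numFound count in the response is needed
        "wt": "json",
    }


def missing_identifier_url(collection: str) -> str:
    """Return the full select URL for the missing-identifier check."""
    return f"{SOLR_SELECT_URL}?{urlencode(missing_identifier_params(collection))}"


if __name__ == "__main__":
    print(missing_identifier_url("sheetmusic"))
```

The `numFound` value in the JSON response is the number of items still lacking an identifier.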

wickr commented 3 years ago

Round 2, covering the remaining 34 values, is now complete. A Solr check now finds 0 items in the collection without identifiers.