pulibrary / dspace-development

DSpace infrastructure and development resources for the Princeton University Library.
https://dspace-development.readthedocs.io/en/latest/

Ensure that the DataSpace 2021/2022 MARC records are imported into Alma #704

Closed jrgriffiniii closed 1 year ago

jrgriffiniii commented 1 year ago

The expected numbers of discoverable records are as follows:

May ’21: 78 records
Sept ’21: 186 records
Nov ’21/Jan ’22: 80 records
Other file: one individual record
jrgriffiniii commented 1 year ago

Following further investigation by collection curators, the following errors were discovered:

In addition to the May 2021 file (78 records), many records are also missing from the September 2021 file. I searched the first 16 records from the September 2021 file, and none of them appeared in Alma. Mark had searched a couple of titles from this file (“Big Data in Financial Economics” and “Testing and Analyzing Correctness in Concurrent Systems”), and these were present. There should be 186 records in this file.

These must be addressed, and the Alma imports must then be verified in Orangelight after 12/12/22.

jrgriffiniii commented 1 year ago

https://github.com/jrgriffiniii/etd-processor/pull/6 contains the last modifications needed to ensure that this is functioning. I am now running it to insert the ARKs.

jrgriffiniii commented 1 year ago

https://github.com/jrgriffiniii/etd-processor/pull/7 provides a task for generating MARC record reports as well.

jrgriffiniii commented 1 year ago

I have confirmed that the total number of records for each batch matches what is expected. I am now manually checking that the records without matching ARKs should indeed be without ARKs.
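
For reference, the per-batch count check can be sketched roughly as follows. This is a minimal illustration and not the code in etd-processor: the batch labels and the `count_mismatches` helper are hypothetical, and only the expected totals come from this issue.

```python
# Expected totals as listed at the top of this issue.
# The batch labels are hypothetical names chosen for this sketch.
EXPECTED_COUNTS = {
    "May 2021": 78,
    "September 2021": 186,
    "November 2021/January 2022": 80,
    "Other": 1,
}


def count_mismatches(actual_counts, expected_counts=EXPECTED_COUNTS):
    """Return {batch: (expected, actual)} for each batch whose count differs."""
    mismatches = {}
    for batch, expected in expected_counts.items():
        # A batch that is absent from the actual counts is treated as 0 records.
        actual = actual_counts.get(batch, 0)
        if actual != expected:
            mismatches[batch] = (expected, actual)
    return mismatches
```

An empty result means every batch has the expected number of records; any entry names the batch to inspect.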

jrgriffiniii commented 1 year ago

https://dataspace.princeton.edu/handle/88435/dsp01th83m246x was not found by query due to errors arising from characters (, ), and ~ in the title.
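
One way to handle such titles is to escape the reserved characters before building the query. The sketch below assumes the search backend uses Lucene/Solr-style query syntax, which this issue does not state; `escape_query` is a hypothetical helper and is not taken from etd-processor.

```python
# Characters that have special meaning in Lucene/Solr-style query syntax.
# Assumption: the DataSpace search being queried follows that syntax.
SPECIAL_CHARS = set('\\+-!(){}[]^"~*?:/&|')


def escape_query(text):
    """Prefix every reserved query character in text with a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)
```

With this, a title containing `(`, `)`, or `~` is sent as literal text instead of being parsed as grouping or fuzzy-search operators.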

jrgriffiniii commented 1 year ago

More encoding issues arose when trying to match an ARK to https://dataspace.princeton.edu/handle/88435/dsp01pr76f6551

jrgriffiniii commented 1 year ago

Search query parameters needed to be enclosed in double quotation marks in order to match an ARK for "How Do States Negotiate?"
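
A rough sketch of that quoting step, using only the Python standard library. The `query` parameter name and both helper functions are assumptions made for illustration; the actual request built by etd-processor may differ.

```python
from urllib.parse import urlencode


def quote_phrase(title):
    """Wrap a title in double quotes so that it is searched as one phrase."""
    # Escape backslashes first, then embedded double quotes.
    escaped = title.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def phrase_query_params(title, param="query"):
    """URL-encode the quoted phrase under a hypothetical parameter name."""
    return urlencode({param: quote_phrase(title)})


# phrase_query_params("How Do States Negotiate?")
# returns 'query=%22How+Do+States+Negotiate%3F%22'
```

Percent-encoding the trailing `?` keeps it from being read as a URL delimiter, and the surrounding quotes keep the title together as a single phrase.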

jrgriffiniii commented 1 year ago

More adjustments were needed to address cases where DataSpace had multiple consecutive spaces and the MARC records had single spaces.
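
A minimal sketch of this kind of whitespace adjustment, assuming both values are collapsed before they are compared. The helper names are hypothetical and not taken from etd-processor.

```python
def collapse_whitespace(text):
    """Replace each run of whitespace with a single space and trim the ends."""
    # str.split() with no arguments splits on any run of whitespace.
    return " ".join(text.split())


def same_after_collapse(dataspace_value, marc_value):
    """Compare two strings while ignoring differences in spacing."""
    return collapse_whitespace(dataspace_value) == collapse_whitespace(marc_value)
```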

jrgriffiniii commented 1 year ago

Major conflicts arose from encoding differences between the MARC records and the DataSpace metadata. There are still 2 items which I need to inspect, but this may well be resolved.
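
The exact nature of these conflicts is not detailed here. One common cause is the same accented character being stored in composed form in one system and decomposed form in the other. Assuming that is the case, a sketch of a comparison that normalizes both sides first (hypothetical helper names, not the etd-processor code):

```python
import unicodedata


def normalize_text(text):
    """Convert text to Unicode NFC (composed) form."""
    return unicodedata.normalize("NFC", text)


def same_after_normalizing(marc_value, dataspace_value):
    """Compare two strings after Unicode normalization."""
    return normalize_text(marc_value) == normalize_text(dataspace_value)
```

For example, "é" stored as a single code point and "e" followed by a combining acute accent compare as equal after normalization, although the raw strings differ.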

jrgriffiniii commented 1 year ago

https://github.com/jrgriffiniii/etd-processor/pull/9 finally produces a full set of updated records with matching ARKs.

jrgriffiniii commented 1 year ago

The request to import the records will need to be submitted on 12/12/22.

jrgriffiniii commented 1 year ago

These were confirmed as having been successfully loaded into Alma on 12/16/22.