galterlibrary / digital-repository

DigitalHub - Institutional Repository for Galter Health Sciences
https://digitalhub.northwestern.edu/

[Export] Missing diacritic causes LCNAF term to be exported as non LCNAF #1135

Closed: fenekku closed this issue 1 year ago

fenekku commented 1 year ago

adapted from Gretchen's comment

Example:

DH: https://digitalhub.northwestern.edu/files/203ffceb-4d19-4867-82c5-dc3fc2778b36
Prism: https://vtfsmghslrepo02.fsm.northwestern.edu/records/6fdd1-swc83

Looks like this will affect all items with the following subjects: "Chicoutimi (Quebec)"; "Quebec (Province)"; "Saguenay River (Québec)".

Make sure we get the LCNAF term at export.
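To illustrate the failure mode: a label typed without its diacritic ("Quebec") will not match an authority label that has one ("Québec") if the comparison is exact. Below is a minimal sketch of a diacritic-insensitive lookup, not the exporter's actual code. The names `fold_diacritics`, `AUTHORITY_URIS` and `lookup_authority_uri` are hypothetical, and the one-entry table is an assumption built from the id.loc.gov URI cited later in this thread.

```python
import unicodedata


def fold_diacritics(label: str) -> str:
    """Return a comparison key: combining marks removed, case folded."""
    # NFKD splits "é" into "e" plus a combining acute accent (U+0301).
    decomposed = unicodedata.normalize("NFKD", label)
    # Drop the combining marks and keep the base characters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.casefold().strip()


# Hypothetical authority table keyed by folded label (assumption for this sketch).
AUTHORITY_URIS = {
    fold_diacritics("Saguenay River (Québec)"):
        "https://id.loc.gov/authorities/subjects/sh86006143",
}


def lookup_authority_uri(label: str):
    """Return the authority URI for a label, ignoring diacritics, or None."""
    return AUTHORITY_URIS.get(fold_diacritics(label))
```

With this folding, both "Saguenay River (Quebec)" and "Saguenay River (Québec)" resolve to the same URI, while a misspelled label still returns `None` and would need manual curation.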

Meowcenary commented 1 year ago

I think this should no longer be an issue now that the header updates are finally finished. I'm going to leave it as "in progress" until the next release and export cycle. I'm planning on releasing and running the export again this Thursday, November 17th.

Meowcenary commented 1 year ago

I changed my mind and decided this was better in "blocked" until the release.

Meowcenary commented 1 year ago

Edit: Actually, I think we can just check this file directly and see if the LCNAF terms are showing up in the export side of things.

The latest export has been sent out, so I'll move this to "in progress". After the import we can check some of the records in question and see if things are mapping correctly.

fenekku commented 1 year ago

As of the 2022-11-17 export/import, the DH terms with "Quebec" in them have been mapped to their correct subject term ids (LCSH/LCNAF/MeSH).

The example case above (DH: https://digitalhub.northwestern.edu/files/203ffceb-4d19-4867-82c5-dc3fc2778b36) is a bit unusual because its Keywords (e.g., "Saguenay River (Quebec)") repeat its subject terms (e.g., https://id.loc.gov/authorities/subjects/sh86006143 - Saguenay River (Québec)). As a result, the export returns "subjects" that appear to contain duplicates, but only because the original record already has them. At this point that is a metadata curation matter rather than a code problem; the data ingestion is doing the right thing.
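For the curation side, here is a rough sketch of how one might flag keywords that merely repeat a subject term once diacritics are ignored. This is an assumption about a possible helper, not part of the export code; `comparison_key` and `keywords_repeating_subjects` are hypothetical names, and the sample values come from the record discussed above.

```python
import unicodedata


def comparison_key(label: str) -> str:
    """Fold case and remove combining marks so "Québec" equals "Quebec"."""
    decomposed = unicodedata.normalize("NFKD", label)
    return "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold().strip()


def keywords_repeating_subjects(keywords, subject_labels):
    """Return the keywords whose folded form matches a subject label."""
    subject_keys = {comparison_key(label) for label in subject_labels}
    return [kw for kw in keywords if comparison_key(kw) in subject_keys]


flagged = keywords_repeating_subjects(
    keywords=["Saguenay River (Quebec)", "Chicontimi (Quebec)"],
    subject_labels=["Saguenay River (Québec)", "Chicoutimi (Quebec)"],
)
```

Only exact matches after folding are flagged, so a misspelling such as "Chicontimi" is not caught and still needs a manual fix.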

Closing as fixed. (We'll let metadata librarians fix the "Chicontimi" keyword for "Chicoutimi" on DH).