galterlibrary / digital-repository

DigitalHub - Institutional Repository for Galter Health Sciences
https://digitalhub.northwestern.edu/

Unmapped Subject: Name (LCNAF) terms #1090

Closed gneidhardt closed 2 years ago

gneidhardt commented 2 years ago

I wasn't sure which existing issue to add this to - feel free to move it/delete it as needed.

https://digitalhub.northwestern.edu/files/e342fabd-2faa-42bf-bd39-582fdac49a0d https://vtfsmghslrepo02.fsm.northwestern.edu/records/9trpe-92w83

I have one term in DH that didn't make it to Prism: the Subject: Name term "Feinberg School of Medicine. Center for Community Health", which isn't in LCNAF. I understand why this mapping failed, but I'm unsure what the best solution is. Some thoughts:

  1. Can we run a report of all terms from "official" vocabularies in DH that failed to map, and then fix them manually? Some will be typos, but I think some were simply entered in the wrong field and will likely need to become keywords. This seems like the better solution for cleaner data, but it is more time-consuming.
  2. Is there a way to automatically turn a "failed" term into a keyword? This may be too complex, but I wasn't sure.
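
Option 2 could be sketched roughly as follows. This is only an illustration of the fallback idea, not how the migration currently behaves: `map_subject`, the `AUTHORITY` dictionary, and the `("subject" | "keyword", value)` return shape are all hypothetical stand-ins for the real lookup.

```python
from typing import Dict, Tuple

# ASSUMPTION: a plain label -> URI dictionary stands in for the real
# authority lookup, which is not shown in this thread.
AUTHORITY: Dict[str, str] = {
    "Galter Health Sciences Library":
        "https://id.loc.gov/authorities/names/no2017056937.html",
}


def map_subject(term: str, authority: Dict[str, str]) -> Tuple[str, str]:
    """Return ("subject", uri) on a match, otherwise ("keyword", term)."""
    uri = authority.get(term)
    if uri is not None:
        return ("subject", uri)
    # No authority match: keep the term as a free-text keyword
    # instead of dropping it during migration.
    return ("keyword", term)


print(map_subject("Galter Health Sciences Library", AUTHORITY))
print(map_subject(
    "Feinberg School of Medicine. Center for Community Health", AUTHORITY))
```

The trade-off is the one noted in option 1: a fallback like this keeps every term, but typos in otherwise valid headings would end up as keywords unless someone reviews them afterwards.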

gneidhardt commented 2 years ago

Another example:

"'Ayn al-Turk (Algeria)" didn't map as Geographic Subject from https://digitalhub.northwestern.edu/files/94afbf13-d7fc-425c-8dc2-8862ed940569 to https://vtfsmghslrepo02.fsm.northwestern.edu/records/gfs8m-syn24 in either front end or back end.

I wonder if this is due to that opening apostrophe - is it not matching exactly from our system to LCNAF? Here's the LC record: https://id.loc.gov/authorities/names/no2019036071.html
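
One way to test the apostrophe theory would be to compare the two labels after folding apostrophe-like characters together. The sketch below assumes the authority label starts with a character other than the ASCII apostrophe (U+02BB is used here as an example); that is an assumption, not something confirmed against the LC record. `fold_apostrophes` is a hypothetical helper, not part of the migration code.

```python
# Characters that commonly stand in for an apostrophe or an ayn/alif mark.
APOSTROPHE_LIKE = "\u02bb\u02bc\u02be\u02bf\u2018\u2019\u0060"


def fold_apostrophes(label: str) -> str:
    """Replace apostrophe-like characters with the ASCII apostrophe."""
    table = {ord(ch): "'" for ch in APOSTROPHE_LIKE}
    return label.translate(table)


dh_label = "'Ayn al-Turk (Algeria)"        # as entered in DH (ASCII apostrophe)
lc_label = "\u02bbAyn al-Turk (Algeria)"   # ASSUMED form of the authority label

print(dh_label == lc_label)                                      # False
print(fold_apostrophes(dh_label) == fold_apostrophes(lc_label))  # True
```

If the raw comparison fails but the folded one succeeds for the real labels, the opening character is the likely cause of the missed match.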

gneidhardt commented 2 years ago

Example:

"Galter Health Sciences Library" didn't transfer as a Name Subject from https://digitalhub.northwestern.edu/files/library-notes-23 to https://vtfsmghslrepo02.fsm.northwestern.edu/records/yrcd0-rc882 in either front or back end.

LC record: https://id.loc.gov/authorities/names/no2017056937.html

Another case: "Tríbeč Mountains (Slovakia)" didn't transfer as a Geographic Subject from https://digitalhub.northwestern.edu/files/593e87a3-400d-4b17-8a47-e6da39788fe6 to https://vtfsmghslrepo02.fsm.northwestern.edu/records/ev96s-nh454. I assume this is because diacritics are being stripped.

LC record: https://id.loc.gov/authorities/subjects/sh85137421.html
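
If diacritics are being stripped somewhere along the way, an exact-match lookup would fail as described. A minimal sketch of that failure mode using only the Python standard library; `strip_diacritics` is a hypothetical helper and is not taken from the migration code:

```python
import unicodedata


def strip_diacritics(label: str) -> str:
    """Decompose the label (NFD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFD", label)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# "Tríbeč Mountains (Slovakia)" written with escapes so the form is unambiguous.
lc_label = "Tr\u00edbe\u010d Mountains (Slovakia)"
stripped = strip_diacritics(lc_label)

print(stripped)              # Tribec Mountains (Slovakia)
print(stripped == lc_label)  # False: exact match against the authority label fails

# A label can also differ only in normalization form (composed vs. decomposed).
# Normalizing both sides to the same form (NFC here) removes that difference.
nfd_label = unicodedata.normalize("NFD", lc_label)
print(nfd_label == lc_label)                                # False
print(unicodedata.normalize("NFC", nfd_label) == lc_label)  # True
```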

Another case: "Northwestern University (Evanston Ill.). Woman's Medical School" (https://lccn.loc.gov/no2012053553) didn't map at all from DH (https://digitalhub.northwestern.edu/files/dca9a317-a456-4349-9a18-24ee104c5afd) to Prism (https://vtfsmghslrepo02.fsm.northwestern.edu/records/4x015-69625).

Meowcenary commented 2 years ago

@gneidhardt - sorry for not getting to this sooner. I'm planning on releasing tomorrow. Since we log failed lookups, we can take the output from the latest migration and filter for those results. We can format it so it's easy to work with in Excel or whatever is best for you. I'll assume CSV is okay unless I hear otherwise.

Meowcenary commented 2 years ago

To complete this, take the latest export log, filter for failed subject lookups (LCNAF, LCSH, etc.), and send the results to Gretchen so she can look through them.
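
A rough sketch of that filtering step, under stated assumptions: the real export log format is not shown in this thread, so the tab-separated `FAILED_LOOKUP` line layout, the sample lines, and the helper names `failed_lookups` and `to_csv` are all hypothetical.

```python
import csv
import io
from typing import Iterable, List, Tuple

# ASSUMPTION: each failed lookup is logged as one tab-separated line:
#   FAILED_LOOKUP <TAB> vocabulary <TAB> term <TAB> source URL
# These sample lines are made up for illustration; only the terms and
# URLs come from the examples reported above.
SAMPLE_LOG = "\n".join([
    "INFO\tmigrated record library-notes-23",
    "FAILED_LOOKUP\tLCNAF\tGalter Health Sciences Library"
    "\thttps://digitalhub.northwestern.edu/files/library-notes-23",
    "FAILED_LOOKUP\tLCSH\tTr\u00edbe\u010d Mountains (Slovakia)"
    "\thttps://digitalhub.northwestern.edu/files/593e87a3-400d-4b17-8a47-e6da39788fe6",
])


def failed_lookups(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Keep only failed-lookup lines as (vocabulary, term, source_url)."""
    rows = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) == 4 and parts[0] == "FAILED_LOOKUP":
            rows.append((parts[1], parts[2], parts[3]))
    return rows


def to_csv(rows: List[Tuple[str, str, str]]) -> str:
    """Render the rows as CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["vocabulary", "term", "source_url"])
    writer.writerows(rows)
    return buffer.getvalue()


rows = failed_lookups(SAMPLE_LOG.splitlines())
print(to_csv(rows))
```

When saving the result to a file for Excel, opening the file with `encoding="utf-8-sig"` is one option that generally helps terms with diacritics display correctly.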