pulibrary / pulfalight

This is an implementation of the Princeton University Library Finding Aids (PULFA) service using ArcLight

create an LCSH masking mechanism as in "Change the Subject" #1088

Closed phoebeno closed 2 years ago

phoebeno commented 2 years ago

The library's catalog already uses this remapping list to hide offensive Library of Congress Subject Headings (LCSH) from users. Catalogers and processing archivists still need to use the LCSH themselves, to link our resources to others through that widely used controlled vocabulary.

The Inclusive and Reparative Metadata Working Group (IRMWG) is responsible for creating the list of LCSH terms to replace, along with the replacement terms PUL will use to cover LCSH terms. For instance, IRMWG has recently approved changing the terms "Indians of North America", "Indians of South America", and "Indians of Mexico" (all used in this finding aid, as one example). We request that those terms be replaced with "Indigenous peoples of North America," etc., in the PULFA user interface. Changes in PULFA should align with changes in the PUL catalog, which are maintained on GitHub in the list linked above. (Kevin Reiss, Carolyn Cole, Christina Chortaria, and Jane Sandberg have been involved in this work, and maybe now also Max Kadel and Ryan Laddusaw.) In finding aids, LCSH are used at the collection level and also at the component level. Please let us know if the type of masking requested can be based on text strings, or if we would need to match up SH records in ASpace (yikes!) with the replacement terms.

Success Criteria

As an example, if I search for "Indians of Mexico" and click the component that matches, I see the heading displayed as "Indigenous peoples of Mexico".
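To make the text-string option concrete, here is a minimal Ruby sketch of string-based masking. It is an illustration under stated assumptions, not the catalog's actual code: the method name mask_heading, the "--" subdivision separator, and the one-entry mapping are all hypothetical.

```ruby
# Hypothetical helper: swap the leading term of a heading when it appears in
# the replacement list, and leave any subdivisions untouched.
# The "--" separator is an assumption; PULFA data may delimit differently.
def mask_heading(heading, replacements, separator: "--")
  main_term, *subdivisions = heading.split(separator)
  replacement = replacements[main_term&.strip]
  return heading unless replacement

  [replacement, *subdivisions].join(separator)
end

# Illustrative mapping only; the real list is maintained by IRMWG.
replacements = { "Indians of Mexico" => "Indigenous peoples of Mexico" }

puts mask_heading("Indians of Mexico--History", replacements)
# prints "Indigenous peoples of Mexico--History"
```

Matching on the string alone would avoid touching the subject records in ASpace, at the cost of missing headings whose spelling or punctuation differs from the list.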

phoebeno commented 2 years ago

Also, @tpendragon, members of IRMWG and IDWG wonder what the time frame for this could be. Do you have any idea whether it could fit, along with other issues, into the next work cycle, for instance? Thanks.

tpendragon commented 2 years ago

@phoebeno Do you mean the next Pulfalight work cycle? If so, I'd be a big supporter of this happening then.

phoebeno commented 2 years ago

@tpendragon Yep, that is what I meant.

hackartisan commented 2 years ago

Here's how it's done in orangelight: https://github.com/pulibrary/bibdata/blob/7c3e63c479ac2f9c8a59b53cfdbe878763ecf5bf/marc_to_solr/lib/traject_config.rb#L810-L832

And the support class: https://github.com/pulibrary/bibdata/blob/38405f611eccbdcc703b68ca91f4f0b79a7dd08e/marc_to_solr/lib/change_the_subject.rb

The config file (also linked above) is yaml: https://github.com/pulibrary/bibdata/blob/6630d8cadbfb8acd8a34aff7664e8821acb313d6/marc_to_solr/lib/change_the_subject/change_the_subject.yml
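As a rough sketch of how a YAML list like this could be turned into a lookup table, the Ruby below parses a small inline sample. The YAML shape shown (each term keyed to a "replacement" entry) is an assumption for illustration and may not match change_the_subject.yml; load_replacements is a hypothetical name.

```ruby
require "yaml"

# Hypothetical loader: build a { "old term" => "new term" } hash from YAML.
# Assumes each top-level key is an LCSH term with a "replacement" entry.
def load_replacements(yaml_text)
  config = YAML.safe_load(yaml_text) || {}
  config.each_with_object({}) do |(term, details), map|
    map[term] = details.fetch("replacement")
  end
end

# Inline sample standing in for the real file (assumed format).
sample = <<~YAML
  "Indians of Mexico":
    replacement: "Indigenous peoples of Mexico"
  "Indians of North America":
    replacement: "Indigenous peoples of North America"
YAML

replacements = load_replacements(sample)
puts replacements["Indians of Mexico"]
# prints "Indigenous peoples of Mexico"
```

Keeping the list in a data file rather than in code means the catalog and the finding aids could, in principle, read the same terms.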

Proposed implementation:

hackartisan commented 2 years ago

The relevant field in pulfalight is called subject_terms_ssim
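For illustration, masking at the field level might look like the sketch below, which treats an indexed document as a plain Ruby hash. Only the field name subject_terms_ssim comes from this thread; mask_subject_terms, the sample document, and the mapping are hypothetical, and this is not how pulfalight's indexer is actually wired.

```ruby
# Hypothetical helper: return a copy of the document with listed subject
# terms replaced. The input hash is not modified, and duplicates that arise
# after replacement are collapsed.
def mask_subject_terms(doc, replacements, field: "subject_terms_ssim")
  return doc unless doc.key?(field)

  masked = Array(doc[field]).map { |term| replacements.fetch(term, term) }.uniq
  doc.merge(field => masked)
end

# Illustrative data only.
replacements = { "Indians of Mexico" => "Indigenous peoples of Mexico" }
doc = { "id" => "example", "subject_terms_ssim" => ["Indians of Mexico", "Maps"] }

p mask_subject_terms(doc, replacements)["subject_terms_ssim"]
# prints ["Indigenous peoples of Mexico", "Maps"]
```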

There's also a field called access_subjects_ssim which comes from upstream arclight. I don't think this field is relevant here but I'm not sure what it is exactly. Related arclight issues / PRs:

phoebeno commented 4 months ago

Hi @tpendragon @hackartisan @eliotjordan I just began wondering how often this list https://github.com/pulibrary/change_the_subject/blob/main/config/change_the_subject.yml gets updated for PULFAlight. I noticed that "Gays" was not being masked in PULFAlight. We have now changed "Gays" to "Gay people" (the newer LCSH term) in our data in ASpace, but the term had me wondering whether we might update the finding aids change-the-subject list to match the one in the catalog. Thank you!

tpendragon commented 4 months ago

@phoebeno We just updated this like yesterday here: https://github.com/pulibrary/pulfalight/pull/1425. It should have been done on the reindex, which I think happens nightly. Do you have a broken example?

Edit: I'm noticing I may not have deployed our update yet - I just started a deploy, so any masking that needs to happen should get fixed tonight if it's not already.

phoebeno commented 4 months ago

Oh, wild, it's like ESP. Thank you!