Closed by rotated8 1 year ago
@tclayton33 @rotated8 I have started working on this feature, please refer to this pull request: https://github.com/emory-libraries/blacklight-catalog/pull/1359
@tclayton33 I have a pull request ready for review. Once it is approved and merged, I will reach out about reindexing Arch and starting testing of the newly added language filter.
That's great. Thanks @abelemlih
v1.10.0 has been released to Test and Arch. Once a full reindex is complete, this ticket will be ready for testing.
Adding enhancement scoring: Value score = 7; Library Search Committee patron impact score = 3
@abelemlih Overall this is looking good. It's working well for displaying headings that just consist of $a as well as those with multiple subfields. The facets work well. The one area that's somewhat inconclusive is search. For example, if I search "Gender identity disorders in children" I get 2 results, but if I search for the replacement term "Gender dysphoria in children" I get 12 results. We suspect this is happening because LC had actually changed the heading for this term, so most of the 12 records have already been corrected through the authority control process, i.e. the phrase "Gender dysphoria in children" is already present in a subject heading of the MARC record.
I'd like to get your thoughts on whether this is fixable (searching "Gender identity disorders in children" would bring up the same 12 records that are retrieved by searching "Gender dysphoria in children"), and if so, how complicated the fix is. I'd also like your opinion on whether you'd rather address this before deploying to production or work on it when we add the full list of terms. (Sofia and I are in favor of moving this into production before this is resolved so we can show a wider audience the prototype.)
@tclayton33 I emailed you data for fields subject_ssim, subject_tesim, and subject_display_ssim to review and create a list of replacements for harmful terms.
prototype deployed 10/12/23 so closing this initial ticket; will create a new ticket once we have a full list of terms to incorporate
Cataloging standards (Library of Congress Subject Headings) include language we would like to avoid showing to patrons.
Example: https://search.libraries.emory.edu/catalog/990010563920302486 This record includes the term "Gender identity disorders"; we want to replace it with "Gender dysphoria".
The replacement term should show up in the catalog's facets, and on the item's display record. We do not want to change the term in a record's title, or MARC data.
On the Solr side, the subject_display_ssim, subject_ssim, and subject_tesim should contain the new term. Additionally, the subject_tesim should include the old term in addition to the new term, effectively adding the replacement term in addition to the existing one.
When making replacements, we should log the ID of the record so we have a list of all the records we are changing.
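To make the intended field behavior concrete, here is a rough sketch of the mapping described above. It is not the catalog's actual indexing code: it is written in Python purely for illustration, the function names, the REPLACEMENTS mapping, and the logger name are hypothetical, and it assumes all three Solr fields are built from the same list of subject heading strings with a simple case-sensitive substring replacement.

```python
import logging

logger = logging.getLogger("subject_term_replacement")  # hypothetical logger name

# Hypothetical mapping of old term -> replacement term. The real list would
# live in the repository; only the pair from this ticket is shown here.
REPLACEMENTS = {"Gender identity disorders": "Gender dysphoria"}


def replace_terms(heading, replacements):
    """Return the heading with each old term swapped for its replacement."""
    for old, new in replacements.items():
        heading = heading.replace(old, new)
    return heading


def remap_subjects(record_id, subjects, replacements=REPLACEMENTS):
    """Build the three Solr subject fields for one record.

    Display and facet fields get only the replacement term. The search
    field gets the replacement term plus the original heading, so a
    search on either term can match the record.
    """
    display = []
    searchable = []
    changed = False
    for heading in subjects:
        updated = replace_terms(heading, replacements)
        display.append(updated)
        searchable.append(updated)
        if updated != heading:
            changed = True
            searchable.append(heading)  # keep the old term searchable
    if changed:
        # Log the record ID so there is a list of every changed record.
        logger.info("Replaced subject terms in record %s", record_id)
    return {
        "subject_display_ssim": display,
        "subject_ssim": list(display),
        "subject_tesim": searchable,
    }
```

With this approach a heading such as "Gender identity disorders in children" would display and facet as "Gender dysphoria in children", while subject_tesim would hold both forms. The MARC data and title fields are never touched, since the swap happens only on the values being sent to Solr.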
The list of terms can live in our repository, although a README.md should exist in the same folder as the terms so users are less likely to stumble across them.
Terms to use for development:
Notes: