pulibrary / bibdata

Local API for retrieving bibliographic and other useful data from Alma (Ruby 3.2.0, Rails 7.1.3.4)
BSD 2-Clause "Simplified" License

Index 653 contents as keywords #2041

Closed escowles closed 8 months ago

escowles commented 1 year ago

Index MARC field 653 (uncontrolled index terms) contents as keywords

Concrete example

Re-index required?

Implementation notes, if any

Approved by DSSG in April 2021

sandbergja commented 1 year ago

@escowles a few questions about this one:

escowles commented 1 year ago

@sandbergja Sorry I didn't see your questions. We do want them indexed for both search and display, displaying them below subjects with the heading "Keywords". I'll ask DSSG for examples.

escowles commented 1 year ago

Apparently we have been stripping this field because it isn't being indexed, so we don't have sample records. So we can probably use the example values from the MARC docs.

kevinreiss commented 1 year ago

Display these values as regular text, not links. Make a ticket in OL to do this after indexing.

maxkadel commented 1 year ago

Starting in February 2023, the 653 fields were no longer being stripped. @mzelesky is finding examples to share for testing.

mzelesky commented 1 year ago

Record with a 653 field that has multiple subfield a: 9925459873506421

Record with multiple 653 fields: 9925493943506421

Miscellaneous records with 653 fields: 9925508433506421 9925505643506421 9925496713506421 9925496093506421 9925494973506421 9925494093506421 9925493823506421 9925492163506421 9925474453506421 9925473353506421 9925466353506421
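
For testing against the records above, the two shapes called out (one 653 with several subfield a values, and several 653 fields in one record) can be handled by the same extraction step. The following is a minimal sketch only: the `Field` struct and `keywords_from_653` are hypothetical names for illustration and are not the ruby-marc or Traject API used by bibdata, and the sample values are borrowed from the 500-note example later in this thread.

```ruby
# Hypothetical in-memory record shape: a list of fields, each with a tag and
# an ordered list of [code, value] subfield pairs. Illustration only; this is
# not the ruby-marc or Traject API.
Field = Struct.new(:tag, :subfields)

# Collect every 653 $a value, covering both repeated 653 fields and
# repeated $a subfields inside a single 653. Duplicates are dropped.
def keywords_from_653(fields)
  fields
    .select { |field| field.tag == '653' }
    .flat_map do |field|
      field.subfields
           .select { |code, _value| code == 'a' }
           .map { |_code, value| value.strip }
    end
    .reject(&:empty?)
    .uniq
end

# Sample record: one 653 with two $a subfields plus a second 653.
sample_record = [
  Field.new('245', [['a', 'Sample title']]),
  Field.new('653', [['a', 'Toy-making'], ['a', 'Toys, wooden']]),
  Field.new('653', [['a', 'Toy-making']])
]

puts keywords_from_653(sample_record).inspect
```

Deduplicating here is a design assumption; if the number of occurrences matters for relevance, the `uniq` call could be dropped.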

mzelesky commented 1 year ago

Here is a report of all 653 fields in PUL records, including the number of bibs that contain each field. https://docs.google.com/spreadsheets/d/1VgWYcZ1yW1EdlxN4G2tX5chIMOvZohXfg84tT6wwgBw/edit?usp=sharing

maxkadel commented 1 year ago

Per Esmé, whether / how this field should be indexed is still under discussion by DSSG.

escowles commented 1 year ago

Jennifer reported back that she looked for email or notes on why she requested this and couldn't find anything. So I'm going to close this issue and we can revisit if she remembers where this request came from or someone else asks about it.

escowles commented 1 year ago

Reopening as new use cases for indexing 653 have surfaced. Minjie noted that non-Roman items are hard to find because other fields include Romanized values, often without translation, while 653 includes English-language topic/format terms. A sample record is: https://catalog.princeton.edu/catalog/9951301943506421 (which includes "toy making" in a 653, a term that is otherwise not included in the record).

DSSG decided that these should be indexed but not displayed.

sandbergja commented 1 year ago

We can discuss this in the next Orangelight meeting.

maxkadel commented 1 year ago

@mzelesky is following up with stakeholders and will come up with a plan. May take a few weeks.

mzelesky commented 1 year ago

A meeting is scheduled for next week.

kevinreiss commented 1 year ago

No final decision until Jennifer returns from vacation at the end of Sept.

mzelesky commented 1 year ago

I met with Minjie Chen, Don Thornbury, and Beck Davis today about the 653 fields in Cotsen records.

The agreed plan of action is to compare all the 653 fields in the Cotsen records with all LCSH authorities (authorized headings and references) to see which ones can be easily mapped to LCSH with no further intervention.
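
One way the comparison described above could work is to normalize each 653 term and look it up against the authorized headings and their references. This is a rough sketch under stated assumptions: the normalization rule, the method names, and the authority data are all made up for illustration and are not actual LCSH data or the process CaMS used.

```ruby
# Normalize for comparison: lowercase, replace punctuation with spaces,
# collapse runs of whitespace. This rule is an assumption.
def normalize(term)
  term.downcase.gsub(/[^[:alnum:]\s]/, ' ').split.join(' ')
end

# authorities: hash of authorized heading => array of references.
# Returns a lookup from each normalized form to its authorized heading.
def build_lookup(authorities)
  authorities.each_with_object({}) do |(heading, references), lookup|
    ([heading] + references).each { |form| lookup[normalize(form)] = heading }
  end
end

# Split keywords into those that map to a heading and those that do not.
def partition_keywords(keywords, lookup)
  matched = {}
  unmatched = []
  keywords.each do |keyword|
    heading = lookup[normalize(keyword)]
    if heading
      matched[keyword] = heading
    else
      unmatched << keyword
    end
  end
  [matched, unmatched]
end

# Made-up sample authority data, NOT real LCSH records.
sample_authorities = {
  'Toy making' => ['Toymaking'],
  'Wooden toys' => ['Toys, Wooden']
}

lookup = build_lookup(sample_authorities)
matched, unmatched = partition_keywords(['Toy-making', 'Toys, wooden', 'Pop-up pictures'], lookup)
puts matched.inspect
puts unmatched.inspect
```

Terms left in `unmatched` would be the ones needing further evaluation, as the plan describes.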

After that process, we will evaluate what to do next. One potential solution raised by Minjie was to add 500 fields for the remaining Cotsen records with 653 fields similar to the following format:

500 __ $a Uncontrolled keywords: Toy-making; Toys, wooden.
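
As an illustration of that format, the note text could be assembled from a record's 653 values roughly as follows. This is a sketch only; `uncontrolled_keywords_note` is a hypothetical helper, and the separator and trailing period are inferred from the single example above.

```ruby
# Build the $a text for a 500 note from a list of 653 values.
# Returns nil when there are no keywords, so no empty note is created.
def uncontrolled_keywords_note(keywords)
  return nil if keywords.empty?

  "Uncontrolled keywords: #{keywords.join('; ')}."
end

puts uncontrolled_keywords_note(['Toy-making', 'Toys, wooden'])
# prints: Uncontrolled keywords: Toy-making; Toys, wooden.
```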

maxkadel commented 1 year ago

@mzelesky - Moved to backlog. Let us know whether further action is needed from the DACS team on this, or whether it can be closed.

mzelesky commented 8 months ago

I believe this can be closed. Other avenues are being pursued for the 653 fields in Cotsen.

kevinreiss commented 8 months ago

Discussed at Alma-DACS 1/9. CaMS is pursuing other avenues to make this data usable using LCSH subject headings added to the records.