psu-libraries / psulib_traject

Penn State University Libraries' Blacklight Catalog Traject Indexer
Apache License 2.0

Generate and index shelf keys for browsing titles #331

Closed banukutlu closed 1 year ago

banukutlu commented 3 years ago

We can use the title_sort field to create a browseable list of titles, but it isn't optimal for display. Instead of maintaining two fields, one for sort and one for display, we can make a sorted field that can be "unsorted" later.

To-Do

Update title_sort to retain the trimmed elements by moving them to the end, for example: The Great Gatsby becomes Great Gatsby The. For display, the original title is reassembled:

> "Great Gatsby The".split.rotate(-1).join(' ')
=> "The Great Gatsby"
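
The round trip could be sketched as below. This is only an illustration under a stated assumption: the trimmed element is a single leading word, since the rotation above restores exactly one word. The method names `to_sort_key` and `from_sort_key` are hypothetical and are not part of psulib_traject.

```ruby
# Hypothetical helpers; the names are illustrative, not from psulib_traject.
# Both assume the title starts with a one-word article such as "The".

# Move the first word to the end: "The Great Gatsby" -> "Great Gatsby The".
def to_sort_key(title)
  title.split.rotate(1).join(' ')
end

# Undo the rotation to rebuild the display title.
def from_sort_key(sort_key)
  sort_key.split.rotate(-1).join(' ')
end

sort_key = to_sort_key('The Great Gatsby')
display  = from_sort_key(sort_key)
```

The reverse step cannot tell from the key alone whether a word was moved, so this sketch would only be safe for titles known to begin with a one-word article.
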

ruthtillman commented 3 years ago

Question: would it be better to just create a left-anchored title search option that disregards stop words? And, if so, should we rely on Solr's stop words, or on the MARC records, where the 245 2nd indicator contains the exact number of characters (including the space) to strip off the beginning of the title? E.g., in https://catalog.libraries.psu.edu/catalog/25380626/marc_view the 245 2nd indicator is "4", which would strip "The" and the space after it.
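
To make the indicator option concrete, here is a minimal sketch that splits a title using the 245 2nd indicator value. The helper name `split_nonfiling` is hypothetical, and the sketch assumes the title and indicator have already been read from the record as plain strings; it does not use any MARC library.

```ruby
# Hypothetical helper; not part of psulib_traject.
# nonfiling is the 245 2nd indicator as a string: the number of leading
# characters (including the trailing space) to skip.
def split_nonfiling(title, nonfiling)
  count = nonfiling.to_i # blank or non-numeric indicators become 0
  count = 0 if count.negative? || count >= title.length
  [title[0, count].strip, title[count..-1]]
end

article, filing_title = split_nonfiling('The Great Gatsby', '4')
# article is "The" and filing_title is "Great Gatsby"
```

An indicator of "0", or a blank one, leaves the title untouched and returns an empty article.
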

awead commented 3 years ago

~I think using the 245 indicators is going to be better. I'm not sure if Solr's stop word list applies to sorted fields. AFAIK, it only applies to indexed fields. In other words, fields with stop words in them will not have those words in the index, so you couldn't search for them, but they'd still be displayed and sorted on.~

awead commented 3 years ago

I've updated the description to reflect a new approach. We can update the existing title_sort field so that the original title can be reassembled in the catalog for display.

ruthtillman commented 1 year ago

Not going to do title browse for now.