emory-libraries / blacklight-catalog


Revise MARC 856 mappings for url_fulltext_ssm and url_suppl_ssim #1162

Closed eporter23 closed 2 years ago

eporter23 commented 2 years ago

The catalogers and the ECR team are working to establish better practices for the use of 856s in catalog records. Their long-term goal is to remove 856s from "print" material records and use portfolio records instead.

In our current catalog, many print records carry 856s. Many of these are broken links, are incorrectly coded so that online access to full text is indicated when only supplemental material such as a table of contents is available, or both. Since this can frustrate users, the following intermediate workaround is recommended if feasible. It is based on a review of sample records and should fix about 80,000 incorrectly coded 856s.

Revise mappings for the following:

url_fulltext_ssm: Keep the current logic, but exclude any 856 where ind2 is 0 or 1 and $3, $y, or $z exactly matches one of these strings:

Table of contents
Table of contents only
Publisher description
Cover image
Contributor biographical information

url_suppl_ssim: Keep the current logic, and also include any 856 where ind2 is 0 or 1 and there is an exact match on one of the strings listed above:

Table of contents
Table of contents only
Publisher description
Cover image
Contributor biographical information

Note: we don't want to use a "contains" type logic here because sometimes a phrase like "table of contents" may be part of a longer phrase such as "Full text including table of contents".
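The rule above could be sketched roughly as follows. This is only an illustration in Python, not the project's indexing code, which is not shown in this thread. The function name, the constant names, and the representation of an 856 as an ind2 string plus a list of (code, value) subfield pairs are all assumptions made for the example. It also assumes "exact match" means strict, case-sensitive string equality with no trimming, and that the same $3, $y, $z check applies to both mappings.

```python
# The five labels come from the issue text; all names below are hypothetical.
SUPPLEMENTAL_LABELS = frozenset({
    "Table of contents",
    "Table of contents only",
    "Publisher description",
    "Cover image",
    "Contributor biographical information",
})

# Subfields whose values are compared against the labels.
LABEL_SUBFIELDS = ("3", "y", "z")


def is_supplemental_856(ind2, subfields):
    """Return True when an 856 should be treated as supplemental.

    ind2      -- second indicator of the 856 field, as a string
    subfields -- list of (code, value) pairs for that field (assumed shape)
    """
    # The override only applies to 856s with ind2 of 0 or 1.
    if ind2 not in ("0", "1"):
        return False
    # Set membership is an exact comparison, so a longer phrase that merely
    # contains one of the labels does not match.
    return any(
        code in LABEL_SUBFIELDS and value in SUPPLEMENTAL_LABELS
        for code, value in subfields
    )
```

Under these assumptions, an 856 for which the function returns True would be left out of url_fulltext_ssm and added to url_suppl_ssim, while every other 856 would continue through the current logic unchanged. A $z of "Full text including table of contents" returns False, which is the case the note about avoiding "contains" logic is meant to protect.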

lovinscari commented 2 years ago

@eporter23 - can you please review this when you get in this morning?

eporter23 commented 2 years ago

@lovinscari @abelemlih I think this looks good. This change has reduced the overall count of "online" items in test by 362,911. This is a little more than I'd originally estimated, but that doesn't concern me: cataloging has told us that most of the 856 "full-text" links in MARC records are not good, and in the long term they plan to remove them. I also spot-checked about 30 records in test, and all seem correct.