Closed by ruthtillman 6 months ago
@ruthtillman The hathi links show on the search page. Do we still want this to happen? Also, if there's a hathi link, we don't show the availability on the search page. If we do want to show the hathi links, do we want the availability to show if there's a hathi link?
So it looks like we're not currently showing them in search results, though we may have earlier on?
results 2 & 3 at least have HT links.
Let's maintain it as is, not showing in search results view but only on the page itself.
One thing I was thinking of while revisiting documentation -- we could choose not to make these calls on a subset of our formats: Audio, Equipment, Games/Toys, Image, Kit, and Video almost certainly shouldn't have it.
But I don't know if it's expensive enough for putting in those exceptions to be worth it -- in those cases neither HathiTrust nor Google Books should return data, so it would be silently querying and doing nothing and maybe that doesn't matter.
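If the exceptions did turn out to be worth it, the check itself could be small. The following is only a sketch in Python, not the Catalog's actual code: the function name `should_query_fulltext` and the assumption that each record exposes its formats as a list of strings are hypothetical. The format names are the ones listed above.

```python
# Formats listed above as almost certainly having no HathiTrust or
# Google Books data. The set name is hypothetical.
SKIP_FULLTEXT_LOOKUP = frozenset(
    {"Audio", "Equipment", "Games/Toys", "Image", "Kit", "Video"}
)


def should_query_fulltext(formats):
    """Return False only when every format on the record is in the skip set.

    Records with no format at all are still queried, since nothing
    suggests they should be excluded.
    """
    if not formats:
        return True
    return any(fmt not in SKIP_FULLTEXT_LOOKUP for fmt in formats)
```

A record tagged both Book and Video would still be queried under this sketch, since only one of its formats is in the skip set.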
And I had a second thought while documenting the Google Books API call -- I think it's the same query we're using to generate thumbnails. Is it possible to avoid making a second call and check the thumbnail query for data but only display it if the HT link fails? Or is that too messy?
Reviewed on its own branch but just checked again in QA. Looks good.
Background
The earliest version of the Catalog used the HathiTrust API. It's been 5 years, so we can't just roll back to that code, but it might be worth looking at.
During lockdown, HathiTrust provided us with a CSV datafile matching our catalog holdings to their scans. We indexed this along with our MARC data to provide keys, which saved us the trouble of API lookups. At the time, we were using ETAS (Emergency Temporary Access Service) to access and digitally check out scans of books that were in copyright but locked in our closed library. When they stopped providing ETAS checkouts, we switched to showing only items which are in the public domain.
HathiTrust stopped sending us matching CSVs in July 2021. Since we were mostly concerned with pre-copyright books, this data hasn't needed to change much, but more books have entered the public domain since then, more have been identified as out of copyright by teams working for HT, and some records have changed over time.
Current Behavior
We currently use data from our index to determine whether or not we can link to a HathiTrust scan of the book.
New Behavior
We'll want to send an API request similar to what we do with Google Books. We only want to display the links where we get full text copies. HathiTrust's search-inside-only functionality is far inferior to the Google Books search inside and we don't ever want to include it.
To make sure this update doesn't accidentally break the logic that interacts with Google Books, display logic should be:
HathiTrust API
The API uses only one identifier parameter. Choose the first one which exists in the following order:
Syntax is:
https://catalog.hathitrust.org/api/volumes/brief/oclc/424023.json
with the appropriate param in place of oclc if needed.
We are looking for at least ONE item which has:
"usRightsString":"Full view"
Sample with several items in Limited View only: https://catalog.hathitrust.org/api/volumes/brief/oclc/424023.json
Sample with only one item, Full View: https://catalog.hathitrust.org/api/volumes/brief/oclc/433.json
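The check described above could be sketched roughly as follows. This is an illustration in Python rather than the Catalog's own code. It assumes the JSON response carries a top-level "items" list whose entries each have a "usRightsString" field, as in the sample URLs. The function names are hypothetical.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen

# URL pattern taken from the syntax example above.
HATHI_BRIEF_URL = (
    "https://catalog.hathitrust.org/api/volumes/brief/{id_type}/{value}.json"
)


def hathi_url(id_type, value):
    """Build the brief-record URL for a single identifier, e.g. ("oclc", 424023)."""
    return HATHI_BRIEF_URL.format(
        id_type=quote(id_type, safe=""),
        value=quote(str(value), safe=""),
    )


def first_full_view_item(response):
    """Return the first item marked "Full view", or None if there is none.

    Assumes `response` is the parsed JSON with an "items" list.
    """
    for item in response.get("items", []):
        if item.get("usRightsString") == "Full view":
            return item
    return None


def fetch_full_view_item(id_type, value, timeout=5):
    """Query the API and return a full-view item, or None (network call)."""
    with urlopen(hathi_url(id_type, value), timeout=timeout) as resp:
        return first_full_view_item(json.load(resp))
```

Keeping the parsing in `first_full_view_item` separate from the network call means the "at least one Full view item" rule can be tested against saved copies of the two sample responses without hitting the API.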
These two pages have more documentation:
Not Changing
We are not changing our indexing/indexed data at this point.