sul-dlss / happy-heron

Self-Deposit for the Stanford Digital Repository (SDR): H2 is a Rails web application enabling users to deposit scholarly content into SDR

FAST vocabulary type ahead lookup in H2 should work #3581

Closed amyehodge closed 1 month ago

amyehodge commented 1 month ago

Currently this box in H2 behaves as a free text entry box. No type ahead search results appear when you type in the box.

App 176307 output: W, [2024-07-22T09:40:55.455746 #176307] WARN -- : [f24fa9fe-2590-42b5-ae17-5047e08d6a50] Autocomplete results for river returned 404

edsu commented 1 month ago

Maybe it would be helpful to see if this service is now available at a supported REST endpoint instead of talking directly to their Solr index?

https://www.oclc.org/developer/api/oclc-apis/fast-api/assign-fast.en.html

edsu commented 1 month ago

It seems like OCLC's FAST user interface is now sending Solr requests like this:

http://fast.oclc.org/cgi-bin/perlProxy.pl?http://authfastdb-m1.prod.oclc.org/fastapps-db/fastIndex/select?q=keywords:(village)&rows=10&start=0&version=2.2&indent=on&fl=id,fullphrase,type,usage,status&sort=usage%20desc

amyehodge commented 1 month ago

From OCLC FAST Team:

I believe our service is working okay but perhaps you are using an ‘experimental.worldcat.org’ url? We recently discontinued that avenue in favor of https://fast.oclc.org/searchfast/fastsuggest?&query=stuff. It should function the same, but if you run into any difficulties, please let us know.

Hope this helps, Russell

-- Russell Schelby OCLC, Technical Manager, Authorities schelbyr@oclc.org

justinlittman commented 1 month ago

@amyehodge We are using http://fast.oclc.org/fastIndex/select

justinlittman commented 1 month ago

https://fast.oclc.org/searchfast/fastsuggest?&query=stuff doesn't help because it doesn't include any identifiers.

amyehodge commented 1 month ago

@justinlittman I've sent another message to the FAST team about this.

edsu commented 1 month ago

From the docs it looks like you can request that idroot be returned? Are those the identifiers that we have been using previously? The IDs look like fst01140419. Here's an example API call:

http://fast.oclc.org/searchfast/fastsuggest?&query=hog&queryIndex=suggestall&queryReturn=suggestall%2Cidroot%2Cauth%2Ctag%2Ctype%2Craw%2Cbreaker%2Cindicator&suggest=autoSubject&rows=3
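For illustration, here is a minimal Ruby sketch of building that same request. The parameter names and values are copied from the example call above; the method name `fast_suggest_uri`, the constants, and the choice of the `https` scheme (taken from OCLC's message earlier in the thread) are assumptions for this sketch, not H2's actual code. Parsing the response is left out because the response format is not shown in this thread.

```ruby
require "uri"

# Base endpoint as given by the OCLC FAST team earlier in this thread.
FAST_SUGGEST_URL = "https://fast.oclc.org/searchfast/fastsuggest"

# Fields requested in the example call; idroot carries the FAST identifier.
RETURN_FIELDS = %w[suggestall idroot auth tag type raw breaker indicator].freeze

# Hypothetical helper: builds the suggest URI for a user-typed query.
def fast_suggest_uri(query, rows: 3)
  uri = URI(FAST_SUGGEST_URL)
  # encode_www_form percent-encodes the commas in queryReturn as %2C,
  # matching the example call.
  uri.query = URI.encode_www_form(
    query: query,
    queryIndex: "suggestall",
    queryReturn: RETURN_FIELDS.join(","),
    suggest: "autoSubject",
    rows: rows
  )
  uri
end

puts fast_suggest_uri("hog")
```

Keeping the field list in one constant makes it harder to drop `idroot` from the request by accident, which was the original concern with the new endpoint.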

amyehodge commented 1 month ago

@arcadiafalcone Can you answer @edsu's question above about the identifiers? Thanks!

arcadiafalcone commented 1 month ago

@edsu The identifiers used for what purpose? For the descriptive metadata, we're using the full URI, which can be derived from the idroot, such as https://id.worldcat.org/fast/1140419/ for the above example, but I may be misunderstanding the question.
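A rough sketch of that derivation in Ruby, based only on the single example in this thread (`fst01140419` becoming `https://id.worldcat.org/fast/1140419/`). The rule it encodes, dropping the `fst` prefix and any leading zeros, is an assumption inferred from that one pair, and the method name `fast_uri_from_idroot` is hypothetical.

```ruby
# Base of the full FAST URI used in the descriptive metadata.
FAST_URI_BASE = "https://id.worldcat.org/fast/"

# Hypothetical helper: converts an idroot such as "fst01140419" to a full URI.
# Assumes the idroot is "fst" followed by a zero-padded number.
def fast_uri_from_idroot(idroot)
  match = /\Afst0*(\d+)\z/.match(idroot)
  raise ArgumentError, "unexpected idroot: #{idroot.inspect}" unless match

  "#{FAST_URI_BASE}#{match[1]}/"
end

puts fast_uri_from_idroot("fst01140419")
```

Raising on an unexpected shape, rather than passing the value through, would surface a change in the identifier format early instead of writing malformed URIs into metadata.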

edsu commented 1 month ago

Thanks @arcadia. I'm just confirming that we could switch H2 over from using

http://fast.oclc.org/fastIndex/select

(which is now gone) to:

http://fast.oclc.org/searchfast/fastsuggest

as long as we request that it return identifiers (idroot). We were initially concerned that the new service didn't return identifiers.

edsu commented 1 month ago

Given that we had to wait for an H2 user to notice that this was broken, I wonder if this is an opportunity to switch the unit test from mocking the service to querying it live?

justinlittman commented 1 month ago

That should probably be an OKComputer check instead of a unit test.

edsu commented 1 month ago

I could understand using OKComputer to make sure the URL still returns 200 OK. But do we use OKComputer checks to ensure that the response format is still what we expect?

mjgiarlo commented 1 month ago

@edsu @justinlittman Instead of hitting the live API in the test suite, or hammering it as part of OKComputer checks, how about we send Honeybadger alerts when the endpoint returns non-200 results? I'm going to work up a PR for that in the meantime.

https://github.com/sul-dlss/happy-heron/pull/3593
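As a sketch of the idea only (this is not the code in the PR above, which is not shown in this thread), the alert-on-non-200 approach could look roughly like this in Ruby. The method name `check_fast_status` and the injected `notifier` are hypothetical. In the Rails app the notifier would presumably be wired to `Honeybadger.notify`, which is an assumption here. The message text mirrors the warning in the log line at the top of this issue.

```ruby
# Hypothetical helper: returns true for a 200 response; otherwise sends an
# alert through the supplied notifier and returns false.
# `notifier` is any callable accepting (message, context:). Injecting it keeps
# this sketch runnable without the Honeybadger gem.
def check_fast_status(status, query:, notifier:)
  return true if status == 200

  notifier.call(
    "Autocomplete results for #{query} returned #{status}",
    context: { query: query, status: status }
  )
  false
end

# Example with a stand-in notifier that just prints the alert.
printer = ->(message, context:) { puts "ALERT: #{message} #{context}" }
check_fast_status(404, query: "river", notifier: printer)
```

This alerts only when real user traffic hits a failing endpoint, so it avoids both live API calls in the test suite and repeated polling from health checks. It does not by itself detect a change in response format.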

justinlittman commented 1 month ago

worksforme