cul-it / qa_server

A rails app with questioning authority gem installed to serve as a QA server.
Apache License 2.0

Add Direct Lookup for Homosaurus #374

Closed sfolsom closed 8 months ago

sfolsom commented 11 months ago

There is a "Search" in the top menu of homosaurus.org. The search results page lists other formats at the bottom. Passing the "q=" parameter to one of those format URLs returns machine-readable results. For example, the Turtle results for "ze" would be: https://homosaurus.org/search/v3.ttl?q=ze
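As a rough illustration of the URL pattern above, here is a minimal Python sketch that builds the search URL for a query. The function name `build_search_url` and the `fmt` argument are hypothetical; only the `ttl` extension is confirmed by the example URL, so any other extension would be an assumption to check against the formats listed on the search results page.

```python
from urllib.parse import urlencode

# Base of the search endpoint, taken from the example URL above.
SEARCH_BASE = "https://homosaurus.org/search/v3"


def build_search_url(query: str, fmt: str = "ttl") -> str:
    """Return the search URL for `query` using the given format extension.

    Hypothetical helper: only the "ttl" extension appears in the issue text.
    """
    # urlencode escapes spaces and other reserved characters in the query.
    return f"{SEARCH_BASE}.{fmt}?{urlencode({'q': query})}"


if __name__ == "__main__":
    print(build_search_url("ze"))
```

The sketch only constructs the URL; fetching and parsing the returned RDF is left out.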

Because this returns RDF, we might(?) be able to use the linked data module, https://github.com/cul-it/qa_server/blob/dev/config/authorities/linked_data/homosaurus_ld4l_cache.json (Would need to point to Homosaurus instead of http://ld4l.org/ld4l_services/cache.)

NB: when RDF results lack a ranking predicate, QA sorts the results alphabetically.

chrisrlc commented 9 months ago

Current direct lookup deployed to: https://lookup-int.ld4l.org

Current scenarios pass, but I see that I've forgotten to copy over the scenario cases from the ld4l cached version. I'll get this pushed to my branch, but let me know if you also want to be able to test those on lookup-int.

Right now search results are sorted alphabetically by label - did we want to use default homosaurus ranking instead?

sfolsom commented 9 months ago

Looks great! It's really nice that the context=true is working. In most (all?) cases, I don't think Sinopia is set up to page through alphabetical results from QA, so the only change needed is to switch from alpha to relevancy. Sinopia users are assuming a classic keyword search and tweaking their search strings until they find the most relevant result.

chrisrlc commented 9 months ago

Unfortunate news: sorting by label appears to be the default for ld authority configs. That can be overridden only if there's another field available to sort on. I can attempt to override this logic, but it might be better to do that as a separate ticket: "Allow ability to retain default sorting from external linked data service". Do we want to hold off on deploying these changes to prod so that we can work on that change, or do you think that the default sort by label is sufficient for release?
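To make the behavior described above concrete, here is a minimal Python sketch of the fallback: sort on another field when one is available, otherwise sort by label. This is not QA's actual implementation. The function `sort_results`, the dictionary shape of a result, and the case-insensitive label comparison are all assumptions made for illustration.

```python
from typing import Optional


def sort_results(results: list[dict], sort_field: Optional[str] = None) -> list[dict]:
    """Sort by `sort_field` when every result has it, else by label.

    Hypothetical sketch of the described default, not QA's code.
    """
    # Use the alternate field only if it is present on every result.
    if sort_field and all(sort_field in r for r in results):
        return sorted(results, key=lambda r: r[sort_field])
    # Fallback: alphabetical by label (case-insensitive is an assumption).
    return sorted(results, key=lambda r: r["label"].casefold())


if __name__ == "__main__":
    hits = [{"label": "ze"}, {"label": "Agender"}, {"label": "hir"}]
    print([h["label"] for h in sort_results(hits)])
```

Under this sketch, results that arrive in the external service's relevancy order but carry no sortable field lose that order, which is the situation the proposed separate ticket would address.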

chrisrlc commented 9 months ago

Side note: the current sorting behavior for the homosaurus_direct search results is the same as all the other QA ld direct lookups.

sfolsom commented 9 months ago

Yeah, we might have to live with alpha sort for now. I like the idea of a separate ticket for thinking about how to retain relevancy sort.

Are all the tests for Homosaurus in int? When I run the tests, I see only 2 tests (one term fetch and one search for "ze"). https://lookup-int.ld4l.org/check_status?utf8=%E2%9C%93&authority=HOMOSAURUS_DIRECT&compare_with=&validation_type=all_checks.

It would be nice to run these tests (https://github.com/cul-it/qa_server_aws_deploy/blob/5785b8dec982a29ca1171bd6c3e6f03c897b7d87/config/authorities/linked_data/scenarios/homosaurus_ld4l_cache_validation.yml#L16) on the alpha sort to see how well they perform before making a final decision about pushing to production.

sfolsom commented 8 months ago

Looks good in Sinopia as-is. We can reopen if something comes up.