sul-dlss-deprecated / dor_indexing_app

An indexing API for Stanford's Digital Object Repository
https://sul-dlss-deprecated.github.io/dor_indexing_app/
Apache License 2.0

Solr query log analysis for Argo #1099

Open ndushay opened 7 months ago

ndushay commented 7 months ago

Query terms: what are the user-entered query terms? How many are used? How many are druids? Titles? Tags? etc.

Facets used

...
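Once the q values are in hand, the query-term questions above could be answered with a rough bucketing like the sketch below. Everything in it is an assumption for illustration: the druid pattern is a loose approximation (two letters, three digits, two letters, four digits, with an optional `druid:` prefix), the " : " test for tags assumes Argo-style "Prefix : Value" tags, and the names `classify` and `tally` are hypothetical.

```python
import re
from collections import Counter

# Loose, assumed druid shape: optional "druid:" prefix, then e.g. bb123cd4567.
DRUID_RE = re.compile(r"^(?:druid:)?[a-z]{2}\d{3}[a-z]{2}\d{4}$", re.IGNORECASE)


def classify(term: str) -> str:
    """Put one q value into a coarse bucket (hypothetical categories)."""
    t = term.strip()
    if not t or t == "*:*":
        return "empty_or_match_all"
    if DRUID_RE.match(t):
        return "druid"
    # Assumption: tags look like "Project : Something".
    if " : " in t:
        return "tag"
    return "other"


def tally(terms):
    """Count how many q values fall into each bucket."""
    return Counter(classify(t) for t in terms)
```

Titles cannot be told apart from other free text this way, so they land in "other" and would need a manual look.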

ndushay commented 7 months ago

Bad news 1: we only keep 7 days of Solr logs.

Bad news 2: I grabbed the logs from the 3 sul-solr VMs containing argo prod. Because the sul-solr VMs are used for multiple collections, I grepped each log for "argo" and then combined the results locally. I then grepped for the path "/select" to get the argo Solr log messages pertaining to searches (as opposed to /update or /admin or ...).

I then tried some clever greps to find the values of the Solr q params. It's certainly possible I blew it, but in 434 lines of log I found only 89 q params, and they're really not interesting.
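For anyone who wants to double-check the greps, the same filtering can be sketched in Python. This is only a sketch under assumptions: it assumes each request line has the usual Solr shape `path=/select params={q=...&fq=...} hits=...` with URL-encoded params, and the function names and the `argo-prod` collection name in the usage note are hypothetical, not taken from the actual logs.

```python
import re
from urllib.parse import parse_qs

# Assumed log shape: "... path=/select params={q=foo&fq=bar} hits=3 status=0 ..."
# Greedy match so a "}" inside the params (e.g. local params) is not cut short.
PARAMS_RE = re.compile(r"params=\{(.*)\}")


def is_argo_select(line: str) -> bool:
    """Mimic `grep argo | grep /select`: keep only argo search requests."""
    return "argo" in line and "path=/select" in line


def extract_q(line: str):
    """Return the decoded q param from one log line, or None if there is none."""
    match = PARAMS_RE.search(line)
    if match is None:
        return None
    params = parse_qs(match.group(1), keep_blank_values=True)
    values = params.get("q")
    return values[0] if values else None


def query_terms(lines):
    """Yield the q value of every argo /select line that has one."""
    for line in lines:
        if is_argo_select(line):
            q = extract_q(line)
            if q is not None:
                yield q
```

Usage would be something like `list(query_terms(open("argo.solr.log")))`. A line such as `[argo-prod] webapp=/solr path=/select params={fq=x&rows=0} hits=9` has no q at all, which may be part of why so few of the 434 lines yield a q param.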

I have attached: a file with all the argo Solr requests (one per VM), a file with only the path "/select" argo Solr requests, and a file with the query strings.

argo.solr.select.log, argo.q.terms.txt, argo.solr.log from c, argo.solr.log from d, argo.solr.log from h

ndushay commented 7 months ago

This also implies only 434 user queries in Argo for a week, which is not a lot for finding patterns of use. Tagging @andrewjbtw