Open tclayton33 opened 1 year ago
Here's how I reproduced one of the examples, the "Central Station" one. This is a title search, so let's look at the definition of a title search. From this we know how to replicate the search in Solr: set `q` to our query, "Central Station", select the checkboxes for `debugQuery` and `edismax`, and finally, concatenate the list of title fields from above (do this programmatically unless you like removing all the quotes) and set that as `qf`. The results should mostly match what was reported by the committee, as long as the data in the selected collection is the same as prod.
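If it helps, here is a rough sketch of the same request made against Solr's `/select` handler instead of the admin UI. The host, collection name, and the `qf` field list are placeholders, not our actual search definition; substitute the concatenated title fields from above.

```python
# Minimal sketch of the "Central Station" reproduction against Solr's /select
# handler. SOLR_URL and the qf field list are placeholders -- substitute the
# real collection and the title fields from the search definition.
import requests

SOLR_URL = "http://localhost:8983/solr/catalog/select"  # hypothetical collection

params = {
    "q": "Central Station",
    "defType": "edismax",          # same as checking the edismax box in the admin UI
    "qf": "title_tesim title_display_tesim",  # placeholder title fields
    "debugQuery": "true",          # adds the 'debug'/'explain' section to the response
    "fl": "id,score,title_display_tesim",
    "rows": 10,
    "wt": "json",
}

data = requests.get(SOLR_URL, params=params).json()

for doc in data["response"]["docs"]:
    print(doc["id"], doc["score"])

# The per-document scoring math lives under debug.explain, keyed by doc id.
explain = data["debug"]["explain"]
```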
At the bottom of the results, Solr will provide a 'debug' section, which includes an 'explain' subsection. This shows the math behind the scores that determine our relevancy ranking. The two most important explanations are for the result the committee thinks should be higher (id '990000746450302486', score 264.82336) and for the result above it (id '990022621180302486', score 270.9209). Full and truncated explanations are in Sharepoint.
The main takeaway is that the lower-ranking result has more text in its description than the higher-ranking one, so "Central Station" makes up a smaller proportion of that item's text and contributes less to its score.
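To make the length effect concrete, here is a small sketch of the textbook BM25 term score, which is the default similarity in recent Solr versions (Lucene's implementation differs in constant factors, and our schema may override the similarity, but the length penalty behaves the same way). The numbers are made up for illustration, not taken from the explain output.

```python
def bm25_term_score(tf, doc_len, avg_doc_len, idf, k1=1.2, b=0.75):
    """BM25 contribution of one term: longer fields shrink the tf component."""
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)
    return idf * (tf * (k1 + 1)) / (tf + norm)

idf = 5.0           # made-up idf for "central" (same for both docs)
avg_doc_len = 40.0  # made-up average field length

# Same term frequency, but the lower-ranked doc has a much longer description.
short_desc = bm25_term_score(tf=1, doc_len=20, avg_doc_len=avg_doc_len, idf=idf)
long_desc = bm25_term_score(tf=1, doc_len=200, avg_doc_len=avg_doc_len, idf=idf)

print(f"short description: {short_desc:.3f}")  # higher score
print(f"long description:  {long_desc:.3f}")   # lower score, purely from length
```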
@abelemlih I am surprised to see `all_text_timv` affecting the score of a title search. Can you look into that? I should note that I cannot get similar results from Solr if `pf=''` is set, as in the search definition.
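For comparison, this is the kind of toggle I was running: the same query with `pf` left empty as in the search definition, and with `all_text_timv` speculatively added as a phrase field. Everything here (URL, field names) is a placeholder, not the actual search definition.

```python
# Hypothetical follow-up to the reproduction above: toggle edismax's pf
# (phrase fields) parameter to see whether a phrase boost on all_text_timv
# accounts for the difference. URL and field names are placeholders.
import requests

SOLR_URL = "http://localhost:8983/solr/catalog/select"  # hypothetical collection
base_params = {
    "q": "Central Station",
    "defType": "edismax",
    "qf": "title_tesim title_display_tesim",  # placeholder title fields
    "fl": "id,score",
    "rows": 5,
    "wt": "json",
}

for label, pf in [("pf=''", ""), ("pf=all_text_timv", "all_text_timv")]:
    params = dict(base_params, pf=pf)
    docs = requests.get(SOLR_URL, params=params).json()["response"]["docs"]
    print(label, [(d["id"], round(d["score"], 2)) for d in docs])
```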
@tclayton33 I must warn you and the committee: changes in relevance will have knock-on effects. We cannot boost one term without effectively de-boosting all the rest. If the committee is happy with results generally, there is no way to change boosts without changing those other results.
@rotated8 @abelemlih The committee did discuss that any changes we make in this area could have undesirable consequences, and we do want to prevent that. But a considerable number of members are also dissatisfied with some of the results for short, exact titles. I just added a new example that came in from a faculty member last week (no. 7). I've also run some comparable searches in Stanford's catalog and added those links to the example document. I'm not sure what Stanford is doing (it may be a lot more complicated than boosting the one field the committee was proposing), but I think their title search results for al-Khaṣāʼiṣ, JAMA, Radiographics, and Traditio are more in line with the behavior our users are expecting.
It's hard to talk about possible consequences in the abstract. We were hoping to address that concern through thorough testing in the blackcat-test environment. Because production and blackcat-test use the same indexes, committee members would be able to run side-by-side comparisons to confirm whether the results are acceptable, or not.
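For the testing discussion, a side-by-side run could be as simple as the sketch below: query both environments and flag where the top results diverge. The hostnames are placeholders, not our real endpoints, and any handler-specific parameters from the search definition would need to be added.

```python
# Sketch of a side-by-side comparison between production and blackcat-test.
# Both environments share the same index, so differences in the top results
# should come from the changed boosts. Hostnames are placeholders.
import requests

ENDPOINTS = {
    "prod": "https://solr-prod.example.edu/solr/catalog/select",           # placeholder
    "blackcat-test": "https://solr-test.example.edu/solr/catalog/select",  # placeholder
}

def top_ids(url, query, rows=10):
    # Add the request handler / params from the search definition as needed.
    params = {"q": query, "fl": "id", "rows": rows, "wt": "json"}
    docs = requests.get(url, params=params).json()["response"]["docs"]
    return [d["id"] for d in docs]

query = "Central Station"
results = {name: top_ids(url, query) for name, url in ENDPOINTS.items()}

for rank, (prod_id, test_id) in enumerate(zip(results["prod"], results["blackcat-test"]), 1):
    marker = "" if prod_id == test_id else "  <-- differs"
    print(f"{rank:2d}. {prod_id}  |  {test_id}{marker}")
```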
@abelemlih and @rotated8 Pardon my newbie question, but that score that Ayoub assigned is just to complete the research for the spike, correct?
I have several examples from the Library Search Committee of title searches (mostly of short, exact titles) that are producing unsatisfactory results. I'd like to have a discussion with the developers to explore what could happen if the precise title fields are boosted.
Here is the example file
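For reference in that discussion, a field boost in edismax is just a `^` weight on a `qf` entry; the sketch below shows what boosting a precise/exact title field could look like. The field names and weights are illustrative placeholders, not the committee's proposal or our actual schema.

```python
# Illustrative only: an edismax query with a hypothetical precise-title field
# weighted well above the general title fields. Field names and weights are
# placeholders, not a proposal.
import requests

SOLR_URL = "http://localhost:8983/solr/catalog/select"  # hypothetical collection
boosted_params = {
    "q": "JAMA",
    "defType": "edismax",
    "qf": "title_precise_ssim^100 title_tesim^10 title_display_tesim^5",
    "fl": "id,score,title_display_tesim",
    "rows": 10,
    "wt": "json",
}

for doc in requests.get(SOLR_URL, params=boosted_params).json()["response"]["docs"]:
    print(doc["id"], doc["score"])
```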