Conal-Tuohy / swinburne

Algernon Charles Swinburne website

Missing highlights #15

Closed jawalsh closed 3 years ago

jawalsh commented 3 years ago

Query: http://swinburne-dev.luddy.indiana.edu/search/?text=pleasure+pain&title=%22a+match%22

This query returns 3 result snippets, but only one of them is highlighted in the full-text view.

[Screenshots: Screen Shot 2021-03-18 at 10 00 44 AM, Screen Shot 2021-03-18 at 10 01 40 AM]

Conal-Tuohy commented 3 years ago

It took me a while to track down the cause of this bug, but I believe I have diagnosed the problem now.

The bug appears to be in the hit-highlighting stylesheet. The Solr query itself works and returns the correct number of matching "snippets", but the stylesheet that locates and highlights those snippets in the page of text does not always succeed. The stylesheet effectively assumes that the snippets returned from Solr are in document order. Usually they are, but Solr actually orders them by its "relevance" ranking, so it may return them in a different order. In the case above, the snippet "If you were queen of pleasure / And I were king of pain" is returned as the first matching snippet. The stylesheet finds that snippet in the text, and then fails to find the remaining two snippets, because it looks for them only after that point in the text.
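
To illustrate the failure mode, here is a minimal Python sketch. It is not the stylesheet's actual code: the function names, the sample text, and the snippet strings are all hypothetical placeholders. It only contrasts a forward-only search, which assumes document order, with a search that accepts the snippets in any order.

```python
# Hypothetical stand-in for a page of text (not the actual poem).
text = (
    "first line with pleasure\n"
    "second line with pain\n"
    "third line with pleasure and pain\n"
)

# Hypothetical snippets in "relevance" order: the last line ranks first.
snippets = [
    "third line with pleasure and pain",
    "first line with pleasure",
    "second line with pain",
]


def find_spans_sequential(text, snippets):
    """Forward-only search: assumes snippets arrive in document order."""
    spans = []
    cursor = 0
    for snippet in snippets:
        start = text.find(snippet, cursor)
        if start == -1:
            continue  # snippet occurs before the cursor, so it is missed
        spans.append((start, start + len(snippet)))
        cursor = start + len(snippet)
    return spans


def find_spans_any_order(text, snippets):
    """Search the whole text for each snippet, then sort by position."""
    spans = []
    for snippet in snippets:
        start = text.find(snippet)
        if start != -1:
            spans.append((start, start + len(snippet)))
    return sorted(spans)


print(len(find_spans_sequential(text, snippets)))
print(len(find_spans_any_order(text, snippets)))
```

With this sample input, the forward-only version locates only the first snippet it is given, while the order-independent version locates all three. A real fix would also need to handle a snippet that occurs more than once in the text, which this sketch ignores.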