Closed: jawalsh closed this issue 3 years ago
I added a utility stylesheet containing a function that abbreviates those unusually large snippets. I imported that stylesheet into the stylesheet which renders search results, and into the stylesheet that marks up hit highlights in the texts themselves, and called the function from both.
The leading and trailing parts of each snippet are cut down to a maximum number of words (currently 20, but modifiable: https://github.com/Conal-Tuohy/swinburne/commit/96e4fd1c220e6f4d19257f7cd8439a1f66a0b2fa#diff-9e82627805270822add84f08f673abf00672d779f83a9044452a14f37ec3d261R22-R23).
It's a definite improvement over Solr's native segmentation, though it could probably be improved by taking punctuation markers as hints for appropriate snippet boundaries.
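For anyone reading along, here is a rough sketch of the trimming idea described above. It is written in Python for illustration only; it is not the XSLT function from the commit, and every name in it (`abbreviate_snippet`, `max_words`, `snap_to_punctuation`) is hypothetical. It assumes the hit is wrapped in `<em>`...`</em>` markers and treats everything before the first marker and after the last one as the leading and trailing parts. The optional `snap_to_punctuation` flag is one guess at how punctuation could be used as a hint for snippet boundaries.

```python
ELLIPSIS = "\u2026"
# Characters treated as clause boundaries (an assumption for this sketch).
BOUNDARY = (".", ";", ":", "!", "?", ",")


def _snap_leading(words):
    """Start just after the first in-window word that ends a clause."""
    for i, word in enumerate(words[:-1]):
        if word.endswith(BOUNDARY):
            return words[i + 1:]
    return words


def _snap_trailing(words):
    """End on the last in-window word (other than the first) that ends a clause."""
    for i in range(len(words) - 1, 0, -1):
        if words[i].endswith(BOUNDARY):
            return words[: i + 1]
    return words


def abbreviate_snippet(snippet, max_words=20, snap_to_punctuation=False,
                       open_tag="<em>", close_tag="</em>"):
    """Trim the text before the first hit and after the last hit to max_words."""
    start = snippet.find(open_tag)
    end = snippet.rfind(close_tag)
    if start == -1 or end == -1 or end < start:
        return snippet  # no highlighted hit found: leave the snippet alone
    end += len(close_tag)
    leading, middle, trailing = snippet[:start], snippet[start:end], snippet[end:]

    lead_words = leading.split()
    if len(lead_words) > max_words:
        kept = lead_words[-max_words:]
        if snap_to_punctuation:
            kept = _snap_leading(kept)
        # Keep a separating space only if the original text had one.
        sep = " " if leading[-1:].isspace() else ""
        leading = ELLIPSIS + " " + " ".join(kept) + sep

    trail_words = trailing.split()
    if len(trail_words) > max_words:
        kept = trail_words[:max_words]
        if snap_to_punctuation:
            kept = _snap_trailing(kept)
        sep = " " if trailing[:1].isspace() else ""
        trailing = sep + " ".join(kept) + " " + ELLIPSIS

    return leading + middle + trailing
```

Parts that are already within the limit are returned untouched, so short snippets pass through unchanged and an ellipsis appears only where words were actually dropped.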
Highlighted full-text results frequently include one or more unusually large highlighted snippets. Often this large snippet is the first snippet. Some examples: