Fixes #487
This sets the highlight method to `unified`, which is the default in Solr 9 but not in Solr 8 (which we are running in production). The `*_tesimv` field is already configured appropriately to take advantage of this more efficient highlighting method, see: https://solr.apache.org/guide/solr/latest/query-guide/highlighting.html#schema-options-and-performance-considerations

Setting `hl.maxAnalyzedChars` to a high value avoids missing highlight matches where the content of the `full_text_tesimv` field is longer than the default of 51200 characters. The largest value in `full_text_tesimv` is roughly 12 million characters long.

Ideally we'd set `hl.maxAnalyzedChars` to -1 as a shortcut for the max integer value, but there is a bug in Solr's `UnifiedHighlighter` that causes an error, see: https://issues.apache.org/jira/browse/SOLR-13121

I reduced `hl.fragsize` from the default of 100 because the resulting snippets were longer than with the previous highlight settings.
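
For reviewers who want to try the settings against a Solr instance by hand, here is a rough sketch of the equivalent request parameters. It is not the code in this PR: the helper name `highlight_params` and the values `15_000_000` and `70` are illustrative assumptions, picked only so that the limit exceeds the roughly 12 million character maximum and the fragment size is below 100. The parameter names (`hl`, `hl.method`, `hl.fl`, `hl.maxAnalyzedChars`, `hl.fragsize`) are standard Solr highlighting parameters.

```python
from urllib.parse import urlencode

# Largest value a Java int can hold; what -1 is meant to stand in for.
JAVA_INT_MAX = 2**31 - 1


def highlight_params(field, max_analyzed_chars, fragsize):
    """Build Solr highlighting parameters for the unified highlighter.

    Hypothetical helper for illustration only; not part of this PR.
    """
    # -1 is rejected on purpose: per SOLR-13121 it causes an error with the
    # unified highlighter, so an explicit positive limit is required.
    if not 0 < max_analyzed_chars <= JAVA_INT_MAX:
        raise ValueError("max_analyzed_chars must be between 1 and 2147483647")
    return {
        "hl": "true",
        "hl.method": "unified",
        "hl.fl": field,
        "hl.maxAnalyzedChars": max_analyzed_chars,
        "hl.fragsize": fragsize,
    }


# Illustrative values (assumptions, not the ones committed here).
params = highlight_params("full_text_tesimv", 15_000_000, 70)

# Query string that could be appended to a Solr /select URL.
query_string = urlencode({"q": "example", **params})
print(query_string)
```

Validating the limit up front keeps the -1 shortcut from slipping back in until the Solr bug is fixed.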