Closed sonika-shah closed 1 week ago
The Java checkstyle failed.
Please run mvn spotless:apply
in the root of your repository and commit the changes to this PR.
You can also use pre-commit to automate the Java code formatting.
You can install the pre-commit hooks with make install_test precommit_install.
@sonika-shah let's try the following:
"description": {
  "type": "text",
  "analyzer": "om_analyzer",
  "term_vector": "with_positions_offsets"
}
private static HighlightBuilder buildHighlights(List<String> fields) {
  // Always highlight the default fields, then any caller-supplied ones.
  List<String> defaultFields =
      List.of(FIELD_DISPLAY_NAME, FIELD_DESCRIPTION, FIELD_DISPLAY_NAME_NGRAM);
  defaultFields = Stream.concat(defaultFields.stream(), fields.stream()).toList();
  HighlightBuilder hb = new HighlightBuilder();
  for (String field : defaultFields) {
    HighlightBuilder.Field highlightField = new HighlightBuilder.Field(field);
    // "fvh" (fast vector highlighter) relies on the term vectors stored by the mapping above.
    highlightField.highlighterType("fvh");
    hb.field(highlightField);
  }
  hb.preTags(PRE_TAG);
  hb.postTags(POST_TAG);
  return hb;
}
@harshach, we could also go with "term_vector": "with_positions_offsets", but it will increase the storage space, as each document will have term vectors stored alongside the actual data.
With the max_analyzed_offset query parameter, on the other hand, we can directly truncate the amount of document text we try to highlight, rather than having to reindex or raise the limit set on the index. It's a better overall solution. Discussion in the thread here: https://discuss.elastic.co/t/for-large-texts-indexing-with-offsets-or-term-vectors-is-recommended/266115/2
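To make the trade-off concrete, here is a minimal toy sketch of what "truncating the text we try to highlight" means. It is not the Lucene, Elasticsearch, or OpenSearch implementation; the class name `TruncatedHighlightSketch` and its `highlight` method are hypothetical, and real highlighters work on analyzed tokens and return fragments rather than the whole text. The only point illustrated is that matches beyond the offset limit are left unhighlighted instead of causing an error.

```java
/** Toy model of offset-limited highlighting (hypothetical, not the engine's code). */
final class TruncatedHighlightSketch {

    private TruncatedHighlightSketch() {}

    /**
     * Wraps case-insensitive occurrences of term in pre/post tags, but only
     * considers matches that end within the first maxAnalyzedOffset characters.
     * Text past the limit is copied through unchanged.
     */
    static String highlight(String text, String term, int maxAnalyzedOffset,
                            String pre, String post) {
        if (term.isEmpty() || maxAnalyzedOffset <= 0) {
            return text; // nothing to analyze
        }
        int limit = Math.min(text.length(), maxAnalyzedOffset);
        StringBuilder out = new StringBuilder();
        int cursor = 0; // start of text not yet copied to the output
        int i = 0;      // current scan position
        while (i + term.length() <= limit) {
            if (text.regionMatches(true, i, term, 0, term.length())) {
                out.append(text, cursor, i)
                   .append(pre)
                   .append(text, i, i + term.length())
                   .append(post);
                i += term.length();
                cursor = i;
            } else {
                i++;
            }
        }
        out.append(text, cursor, text.length()); // unanalyzed tail, untouched
        return out.toString();
    }
}
```

With a limit of 10 characters, only the first "data" in "data lake and data mesh" is tagged; with a limit that covers the whole string, both are. The cost of this approach is missed highlights deep in long descriptions, which is the price paid for avoiding reindexing and extra term-vector storage.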
Describe your changes:
Cherry-picked in 1.5.11.
Fixes an issue with highlighting large text fields in OpenSearch, where we were hitting a max_analyzed_offset error due to highlight size limits.
Solution: set max_analyzed_offset directly in the HighlightBuilder at the query level.
Error Message:
OpenSearch and Elasticsearch have slightly different ways to set this:
Elasticsearch:
hb.maxAnalyzedOffset(MAX_ANALYZED_OFFSET);
OpenSearch:
hb.maxAnalyzerOffset(MAX_ANALYZED_OFFSET);
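For illustration, a rough sketch of how the naming difference might surface in the highlight section of the search request body. It uses only the JDK, not either client library. The JSON parameter names `max_analyzed_offset` (Elasticsearch) and `max_analyzer_offset` (OpenSearch) are assumed here to mirror the two builder methods above and should be checked against the docs for the engine version in use. The class `HighlightRequestSketch`, the `Engine` enum, and `highlightJson` are hypothetical names, and field names are assumed not to need JSON escaping.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper that renders a highlight clause per engine. */
final class HighlightRequestSketch {

    /** Engine-specific name of the offset limit (assumed, see lead-in). */
    enum Engine {
        ELASTICSEARCH("max_analyzed_offset"),
        OPENSEARCH("max_analyzer_offset");

        final String offsetParam;

        Engine(String offsetParam) {
            this.offsetParam = offsetParam;
        }
    }

    private HighlightRequestSketch() {}

    /** Builds a compact highlight clause with the limit set at the query level. */
    static String highlightJson(Engine engine, int maxOffset, List<String> fields) {
        // Each field gets an empty options object; field names are assumed
        // to contain no characters that require JSON escaping.
        String fieldsJson = fields.stream()
                .map(f -> "\"" + f + "\":{}")
                .collect(Collectors.joining(","));
        return "{\"" + engine.offsetParam + "\":" + maxOffset
                + ",\"fields\":{" + fieldsJson + "}}";
    }
}
```

Calling `highlightJson(Engine.OPENSEARCH, 1000, List.of("description"))` yields a clause whose limit key is `max_analyzer_offset`, while the Elasticsearch variant uses `max_analyzed_offset`. Keeping the key behind a single enum is one way to avoid scattering the engine difference across query-building code.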
Solution from the discussion on these GitHub issues:
#