Open eroux opened 3 months ago
This query seems to be working:
{
  "query": {
    "bool": {
      "should": [
        {
          "has_child": {
            "type": "etext",
            "query": {
              "nested": {
                "path": "chunks",
                "query": {
                  "match_phrase": {
                    "chunks.text_bo": "བྱ་ངང་བ་སེར་བོ་མཚོ་"
                  }
                },
                "inner_hits": {
                  "highlight": {
                    "fields": {
                      "chunks.text_bo": {
                        "highlight_query": {
                          "match_phrase": {
                            "chunks.text_bo": "བྱ་ངང་བ་སེར་བོ་མཚོ་"
                          }
                        }
                      }
                    }
                  }
                }
              }
            },
            "inner_hits": {
              "_source": {
                "includes": ["id"]
              },
              "highlight": {
                "fields": {
                  "chunks.text_bo": {
                    "highlight_query": {
                      "match_phrase": {
                        "chunks.text_bo": "བྱ་ངང་བ་སེར་བོ་མཚོ་"
                      }
                    }
                  }
                }
              }
            }
          }
        }
      ]
    }
  }
}
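Since the phrase is repeated three times in that body (once in the nested query and once in each `highlight_query`), it may help to generate the request from a single value. The following is only a sketch: `build_etext_phrase_query` is a hypothetical helper name, not something from the codebase, and it simply reproduces the structure above as a Python dict.

```python
import json


def build_etext_phrase_query(phrase, field="chunks.text_bo"):
    """Build the has_child/nested request body from one phrase (hypothetical helper)."""

    def phrase_clause():
        # Return a fresh dict each time so the three copies are independent objects.
        return {"match_phrase": {field: phrase}}

    def highlight():
        # Highlight block that reuses the same phrase as its highlight_query.
        return {"fields": {field: {"highlight_query": phrase_clause()}}}

    nested = {
        "path": "chunks",
        "query": phrase_clause(),
        "inner_hits": {"highlight": highlight()},
    }
    has_child = {
        "type": "etext",
        "query": {"nested": nested},
        "inner_hits": {"_source": {"includes": ["id"]}, "highlight": highlight()},
    }
    return {"query": {"bool": {"should": [{"has_child": has_child}]}}}


body = build_etext_phrase_query("བྱ་ངང་བ་སེར་བོ་མཚོ་")
# ensure_ascii=False keeps the Tibetan text readable in the printed JSON.
print(json.dumps(body, ensure_ascii=False, indent=2))
```

The printed JSON can be pasted into a search request as-is; only the phrase argument needs to change between tests.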
OpenSearch currently highlights etext fields (and, I guess, other fields too) by highlighting every token in the result that also appears in the query. If the query contains "pa", it highlights every "pa" in the etext result, which makes the output very noisy. It should do its best to highlight only the tokens that are part of an actual match for the query.
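To make the noise concrete, here is a toy model of the two behaviours. It is not how OpenSearch implements highlighting: it assumes whitespace-separated tokens, uses made-up romanized sample text, and the function names `highlight_tokens` and `highlight_phrase` are purely illustrative.

```python
def highlight_tokens(text, query):
    """Toy model of the noisy behaviour: wrap every token that occurs in the query."""
    query_tokens = set(query.split())
    return " ".join(
        f"<em>{t}</em>" if t in query_tokens else t for t in text.split()
    )


def highlight_phrase(text, query):
    """Toy model of the desired behaviour: wrap only tokens inside a full phrase match."""
    tokens, phrase = text.split(), query.split()
    n = len(phrase)
    marked = [False] * len(tokens)
    if n:
        # Slide a window of the phrase length over the text tokens.
        for i in range(len(tokens) - n + 1):
            if tokens[i:i + n] == phrase:
                for j in range(i, i + n):
                    marked[j] = True
    return " ".join(
        f"<em>{t}</em>" if m else t for t, m in zip(tokens, marked)
    )


text = "pa dang bya ngang pa ser po"
query = "ngang pa"
print(highlight_tokens(text, query))
# <em>pa</em> dang bya <em>ngang</em> <em>pa</em> ser po
print(highlight_phrase(text, query))
# pa dang bya <em>ngang</em> <em>pa</em> ser po
```

In the first output the leading "pa" is highlighted even though it is not part of the phrase; the second output marks only the tokens that belong to the phrase occurrence.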
Two possible strategies are: