Open pbvahlst opened 5 years ago
Pinging @elastic/es-search
@slumx thanks for your interest in elasticsearch. I doubt that this issue relates to the JVM version, but nevertheless, it could be good to double check that it reproduces on a supported JVM version.
My bad, it is running 1.8.0_152 (the included one)
Can you test with the unified highlighter? It can also use term vectors, so the performance should be similar. The fast_vector highlighter is not actively maintained in Lucene, and the unified highlighter was added to replace the old ones, so it might be faster for you to switch to this new highlighter rather than wait for a bug resolution.
Ok I will try that. The reason that I use FVH is that it seems to be the only highlighter which can combine fields analyzed with different analyzers into one field, isn't this still the case? (We use this feature a lot).
> The reason that I use FVH is that it seems to be the only highlighter which can combine fields analyzed with different analyzers into one field, isn't this still the case? (We use this feature a lot).
This is not implemented yet which is the reason why we keep the fast_vector highlighter for now. I'll try to reproduce the bug to see if the fix is simple.
Is there any possibility of supporting this in the future?
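For readers unfamiliar with the feature discussed above: combining fields analyzed with different analyzers into one highlighted field is done with the fvh `matched_fields` option. The request below is only a minimal sketch; the index name `my-index`, the fields `comment` and `comment.plain`, and the query text are hypothetical, and both fields are assumed to be mapped with `"term_vector": "with_positions_offsets"`.

```
# Hypothetical index and field names, for illustration only
GET my-index/_search
{
  "query": {
    "query_string": {
      "query": "running scissors",
      "fields": [ "comment", "comment.plain" ]
    }
  },
  "highlight": {
    "fields": {
      "comment": {
        "type": "fvh",
        "matched_fields": [ "comment", "comment.plain" ]
      }
    }
  }
}
```

Matches from both fields are merged and reported under the single `comment` key in the highlight response.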
The issue is still present in Elasticsearch 8.12 when the fvh highlighter is used, but everything works well when the unified highlighter is used:
PUT index1
{
  "settings": {
    "analysis": {
      "filter": {
        "my_synonyms_filter": {
          "type": "synonym_graph",
          "synonyms": [
            "face hugger, facehugger, alien",
            "easter eg, groundhog day"
          ]
        }
      },
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "standard",
          "filter": [ "my_synonyms_filter" ]
        }
      }
    }
  },
  "mappings": {
    "properties": {
      "content": {
        "type": "text",
        "search_analyzer": "my_analyzer",
        "term_vector": "with_positions_offsets"
      }
    }
  }
}
POST index1/_bulk?refresh=true
{ "index" : {"_id": 1} }
{ "content" : "face hugger resurrection"}
Using the unified highlighter returns the expected results:
GET index1/_search
{
  "highlight": {
    "fields": {
      "content": {
        "type" : "unified"
      }
    }
  },
  "query": {
    "match_phrase" : {
      "content": "alien resurrection"
    }
  }
}
"hits": [
  {
    "_index": "index1",
    "_id": "1",
    "_score": 0.8630463,
    "_source": {
      "content": "face hugger resurrection"
    },
    "highlight": {
      "content": [
        "<em>face</em> <em>hugger</em> <em>resurrection</em>"
      ]
    }
  }
]
But when using the fvh highlighter, the hit is returned without any highlight:
GET index1/_search
{
  "highlight": {
    "fields": {
      "content": {
        "type" : "fvh"
      }
    }
  },
  "query": {
    "match_phrase" : {
      "content": "alien resurrection"
    }
  }
}
"hits": [
  {
    "_index": "index1",
    "_id": "1",
    "_score": 0.8630463,
    "_source": {
      "content": "face hugger resurrection"
    }
  }
]
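One way to narrow this down is to look at what the search analyzer emits for the failing phrase. The request below is a minimal sketch using the `_analyze` API against the index defined above. That the synonym expansion of "alien" is what trips up fvh is an assumption, not something confirmed in this thread.

```
# Inspect the tokens and positions my_analyzer produces for the phrase
GET index1/_analyze
{
  "analyzer": "my_analyzer",
  "text": "alien resurrection"
}
```

Comparing the `position` values of the returned tokens with the term vectors stored for the `content` field may show where the two highlighters diverge.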
Pinging @elastic/es-search-relevance (Team:Search Relevance)
Describe the feature:
Elasticsearch version (bin/elasticsearch --version): 7.3
Plugins installed: [icu]
JVM version (java -version): 9.0.1
OS version (uname -a if on a Unix-like system): Win10
Description of the problem including expected versus actual behavior: A phrase search with query_string, e.g. on "alien resurrection", does not get highlighted if the synonym_graph filter is enabled and the synonym list contains multi-term synonyms that include one of the terms from the phrase search. In some cases it seems to work partially:
Steps to reproduce:
Please include a minimal but complete recreation of the problem, including (e.g.) index creation, mappings, settings, query etc. The easier you make it for us to reproduce, the more likely that somebody will take the time to look at it.
search_analyzer and add a multi term synonym for "face hugger, alien"
Provide logs (if relevant):