Describe the bug
The language annotation is applied only once even when several are provided, so the whole search query is stemmed with a single language.
To Reproduce
Schema:
schema items {
    document items {
        field language type string {
            indexing: set_language | summary | attribute
            attribute {
                fast-access
                fast-search
            }
            rank: filter
        }
        field title type string {
            indexing: summary | index
            match: text
        }
    }
    fieldset default {
        fields: title
    }
}
Vespa search request:
{
    "yql": "select * from items where ({language: 'fr', grammar: 'all'}userInput(@q)) or ({language: 'en', grammar: 'all'}userInput(@q))",
    "q": "machine learning",
    "trace.level": 3
}
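To make the reproduction easier to vary (for example, to swap the language order), the request body above can be generated from a list of language codes. This is only a minimal sketch: `build_request` is a hypothetical helper name, not part of any Vespa client, and it only builds the JSON body without sending it.

```python
import json


def build_request(query_text, languages, trace_level=3):
    """Build the request body from the report for the given language codes.

    Hypothetical helper for reproducing the issue; not a Vespa API.
    """
    # One userInput clause per language, each with its own annotation.
    clauses = [
        "({language: '%s', grammar: 'all'}userInput(@q))" % lang
        for lang in languages
    ]
    return {
        "yql": "select * from items where " + " or ".join(clauses),
        "q": query_text,
        "trace.level": trace_level,
    }


body = build_request("machine learning", ["fr", "en"])
print(json.dumps(body, indent=2))
```

Passing `["en", "fr"]` instead produces the swapped-order variant of the request.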
Inspecting the traces, I see only a single stemming message, indicating that both query operators were stemmed using French:
{
    "message": "Stemming with language=FRENCH"
}
When I swap the order of the languages, both query operators are stemmed using English only:
{
    "message": "Stemming with language=ENGLISH"
}
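The check on the traces can be automated by collecting every stemming message from the query response. The sketch below walks the whole JSON tree rather than assuming a particular trace layout; `stemming_languages` is a hypothetical helper, and `sample_trace` is an illustrative stand-in for a real response, not captured output.

```python
def stemming_languages(node):
    """Collect the language from every 'Stemming with language=...' message.

    Recurses through all dicts and lists, so it does not depend on
    where in the response the trace messages are nested.
    """
    prefix = "Stemming with language="
    found = []
    if isinstance(node, dict):
        message = node.get("message")
        if isinstance(message, str) and message.startswith(prefix):
            found.append(message[len(prefix):])
        for value in node.values():
            found.extend(stemming_languages(value))
    elif isinstance(node, list):
        for item in node:
            found.extend(stemming_languages(item))
    return found


# Illustrative stand-in for a response; the nesting is an assumption.
sample_trace = {
    "trace": {"children": [{"message": "Stemming with language=FRENCH"}]}
}
print(stemming_languages(sample_trace))
```

With the behavior described in this report, the returned list would contain a single language; with per-operator stemming it would be expected to contain one entry per annotated operator.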
Expected behavior
The language query annotation should be applied on a per-operator basis, so that each userInput clause is stemmed with its own language.
Environment:
OS: Docker
Infrastructure: Localhost
Versions: n/a
Vespa version: 8.308.26
Additional context
This can be implemented with custom searchers, but that is challenging for non-engineers, especially data scientists, who usually know Python well but not Java.