Open kkraune opened 1 year ago
For the record, queries with a bag of words (not intended as a phrase) can be written as
vespa query 'select * from music where {stem:false}userInput(@q)' q="paneer butter masala" tracelevel=3
and the stem-annotation works as intended:
{
"message": "sc0.num0 search to dispatch: query=[WEAKAND(100) default:paneer default:butter default:masala] timeout=9998ms offset=0 hits=10 groupingSessionCache=true sessionId=2cf3dee6-68ad-4ecd-85b8-107639e3c133.1688645538004.71.default grouping=0 : restrict=[music]"
},
{
"message": "Current state of query tree: WEAKAND[N=100]{\n WORD[fromSegmented=false index=\"default\" origin=\"(0 6)\" segmentIndex=0 stemmed=true uniqueID=1 words=true]{\n \"paneer\"\n }\n WORD[fromSegmented=false index=\"default\" origin=\"(7 13)\" segmentIndex=0 stemmed=true uniqueID=2 words=true]{\n \"butter\"\n }\n WORD[fromSegmented=false index=\"default\" origin=\"(14 20)\" segmentIndex=0 stemmed=true uniqueID=3 words=true]{\n \"masala\"\n }\n}\n" },
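To check quickly whether a term was changed by stemming, the terms in the "search to dispatch" trace line can be compared with the input words. The following is only a sketch: `dispatched_terms` is a hypothetical helper, the sample line is shortened from the trace above, and it assumes the bag-of-words `index:term` format shown there (other query shapes may print differently).

```shell
# Sample "search to dispatch" trace line, shortened from the trace above.
trace='sc0.num0 search to dispatch: query=[WEAKAND(100) default:paneer default:butter default:masala] timeout=9998ms'

# Hypothetical helper: read trace text on stdin and print, one per line,
# the terms dispatched for the given index name.
dispatched_terms() {
  index="$1"
  grep -o "${index}:[^] ]*" | sed "s/^${index}://"
}

# Join the extracted terms with spaces so they can be compared with the input.
terms=$(printf '%s\n' "$trace" | dispatched_terms default | paste -sd' ' -)
echo "$terms"
```

If the printed terms equal the words passed in q, no stemming was applied; a term such as "pan" in place of "paneer" would indicate that it was.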
Describe the bug
Stemming of single terms and phrases is inconsistent and confusing, and documentation is missing.
To Reproduce
Using https://docs.vespa.ai/en/vespa-quick-start.html
This looks right:
vespa query 'select * from music where album contains ({stem:false}"paneer")' tracelevel=3 language=en-US
outputs
Trying a phrase:
vespa query 'select * from music where album contains ({stem:false}"paneer butter masala")' tracelevel=3 language=en-US
outputs
In this case, the {stem:false} annotation does not work: "paneer" is stemmed to "pan". I think this is because stem cannot be used on a phrase:
vespa query 'select * from music where album contains ({stem:false}phrase("paneer", "butter", "masala"))' tracelevel=3 language=en-US
Per the documentation, phrase takes no annotations, which is consistent with the behavior above (assuming SPHRASE and PHRASE behave the same). So we must either document more clearly that an implicit phrase cannot disable stemming, or change the phrase operators to support stem: false.
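Until one of those options is in place, a possible stopgap is to annotate each word separately, since the single-term form is reported above to work. This is a sketch under that assumption only: it has not been verified beyond the single-term case, and it matches documents containing all the terms rather than the exact phrase, so word adjacency is lost.

```shell
# Input words; relies on the shell's default whitespace word splitting.
q="paneer butter masala"

# Build one {stem:false} "contains" clause per word and join them with "and".
yql='select * from music where '
sep=''
for term in $q; do
  yql="${yql}${sep}album contains ({stem:false}\"${term}\")"
  sep=' and '
done

printf '%s\n' "$yql"
```

The resulting string can be passed to the CLI in the same way as the queries above, for example vespa query "$yql" tracelevel=3 language=en-US, and the trace inspected to confirm that no term was stemmed.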
Vespa version
8.188.15