The precedence for the unknown (implied) operation is always lower than for AND and OR operations, as evidenced by how the current Apache Lucene library handles this. A simple `TreeVisitor` on a `PrecedenceQueryParser` shows the following:
The results are similar when the default operator is changed to AND.
This commit adds a fictitious token for the invisible implied operation and gives it the lowest precedence. This should ensure that `a b AND c d` is correctly parsed as `a (b AND c) d`, instead of the current `a (b AND (c d))`.
To make this work, the list of precedence items had to shrink quite a bit. Several tokens in it did not require disambiguation, and they caused issues because there is no lookahead token for the implied operation.
While cleaning up this list, we also noticed that the precedence of the boost operator was configured incorrectly, so that is fixed as well. This ensures that `+2^1` becomes `+(2^1)`.
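The intended grouping can be illustrated with a small precedence-climbing parser. This is only a sketch in Python, not the grammar changed by this commit: the names `IMPLIED`, `parse` and `render` and the numeric binding powers are assumptions made for the example, and there is no error handling. It treats the gap between two adjacent clauses as an operator that consumes no token and binds more loosely than OR and AND, and it lets the boost operator bind more tightly than a leading `+`.

```python
import re

# Binding powers assumed for this sketch; higher binds tighter.
IMPLIED, OR, AND = 1, 2, 3
BINARY = {"OR": OR, "AND": AND}


def tokenize(query):
    # Words (terms, numbers, AND/OR) or the single-character operators.
    return re.findall(r"[\w.]+|[+^]", query)


def parse(query):
    tokens = tokenize(query)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        pos += 1
        return tokens[pos - 1]

    def clause():
        # A leading '+' applies to the whole boosted term: +2^1 -> +(2^1).
        if peek() == "+":
            take()
            return ("+", clause())
        term = take()
        if peek() == "^":
            take()
            return ("^", term, take())
        return term

    def expr(min_prec):
        left = clause()
        while peek() is not None:
            # No explicit operator between two clauses means the implied one.
            name = peek() if peek() in BINARY else "IMPLIED"
            prec = BINARY.get(name, IMPLIED)
            if prec < min_prec:
                break
            if name != "IMPLIED":
                take()  # the implied operator has no token to consume
            right = expr(prec + 1)  # left-associative
            if isinstance(left, tuple) and left[0] == name:
                left = left + (right,)  # keep chains of one operator flat
            else:
                left = (name, left, right)
        return left

    return expr(IMPLIED)


def render(node, nested=False):
    # Print the tree with explicit parentheses around nested groups.
    if isinstance(node, str):
        return node
    op, *args = node
    if op == "+":
        return "+" + render(args[0], nested=True)
    if op == "^":
        text = f"{args[0]}^{args[1]}"
    else:
        parts = [render(arg, nested=True) for arg in args]
        text = " ".join(parts) if op == "IMPLIED" else f" {op} ".join(parts)
    return f"({text})" if nested else text


print(render(parse("a b AND c d")))  # a (b AND c) d
print(render(parse("+2^1")))         # +(2^1)
```

In a generated parser the same effect comes from the precedence declarations rather than from a loop like this, which is why the implied operation needs a fictitious token to attach a precedence to.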