Open asfimport opened 6 years ago
Jim Ferenczi (@jimczi) (migrated from JIRA)
Since it is possible to index suggestions with or without separators (preservePositionIncrements) the context query adds an optional separator after the context automaton. This character is optional so the regex "." can match the context plus the separator label but nothing from the real suggestions. Completion queries should always match a prefix from the suggestions (hence the assertion) but it doesn't handle regex that starts with ".". I've attached a patch to fix the issue that adds a parameter in the ContextQuery constructor to indicate if suggestions are indexed with position increments or not.This is a breaking change since it requires to match the value used for indexing but I don't see how to do it differently if we want to match regex that starts with any character accurately (e.g.: ".[s|t]").
Julie Tibshirani (@jtibshirani) (migrated from JIRA)
Thank you @jimczi! I'm looking forward to this fix going in, as we're hoping to use a "." regex completion query as part of an elasticsearch enhancement (https://github.com/elastic/elasticsearch/issues/19147),).
Jim Ferenczi (@jimczi) (migrated from JIRA)
I attached a new patch that fixes the name of the option (preserveSep is the correct option, preservePositionIncrements is for keeping holes when tokens are removed by a filter). I'll commit next week if there is no objection.
When a RegexCompletionQuery of "." is provided to ContextQuery, the following assertion failure occurs:
Note that this is a related, but distinct issue from #9334, where the RegexCompletionQuery is empty.
The attached patch provides a reproduction of the issue, as the test case TestContextQuery#testRegexDotQuery. To reproduce, Java assertions must be enabled (as in the default configuration for tests).
The patch also provides a test case for the normal behavior of an empty RegexCompletionQuery, when it is not wrapped in ContextQuery (TestRegexCompletionQuery#testRegexDotQuery). In this case, there is no error, and all suggestions are returned.
From a quick look, it seems as though "." doesn't capture any characters past CompletionAnalyzer.SEP_LABEL, so the matching prefix in ContextCompletionWeight#setInnerWeight is unexpectedly empty.
Migrated from LUCENE-8288 by Julie Tibshirani (@jtibshirani), updated May 07 2018 Attachments: LUCENE-8288.patch (versions: 2), LUCENE-8288-repro.patch