Closed by mikemccand 5 years ago
@klaporte did you try this PhraseWildcardQuery? Do you have some feedback about it?
We will probably move it to lucene/sandbox.
[Legacy Jira: Bruno Roustant (@bruno-roustant) on Nov 14 2019]
Hi @bruno-roustant. I haven't yet. The team we're working with is reluctant to make modifications to the software at this point, as they have released to their beta clients. At present, we've shifted to testing this internally in the hope of making progress there.
[Legacy Jira: Ken LaPorte on Nov 14 2019]
I updated the PR.
Summary of the decision taken in the PR (see there for explanations):
[Legacy Jira: Bruno Roustant (@bruno-roustant) on Nov 21 2019]
I'll merge this PR within 2 days if there is no objection.
[Legacy Jira: Bruno Roustant (@bruno-roustant) on Nov 25 2019]
Commit 8485b5a939c5ffc4982dd338d59cdf090c5e1e58 in lucene-solr's branch refs/heads/master from Bruno Roustant https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8485b5a
LUCENE-8983: Add PhraseWildcardQuery to control multi-terms expansions in phrase.
[Legacy Jira: ASF subversion and git services on Nov 27 2019]
Commit d764bf345e2789589fbead7df5838dc20247c577 in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=d764bf3
LUCENE-8983: Add PhraseWildcardQuery to control multi-terms expansions in phrase.
[Legacy Jira: ASF subversion and git services on Nov 27 2019]
This change seems to be causing test failures, e.g. https://jenkins.thetaphi.de/job/Lucene-Solr-8.x-Solaris/431/
[Legacy Jira: Adrien Grand (@jpountz) on Nov 28 2019]
This change seems to be causing test failures
Looking into this.
[Legacy Jira: Bruno Roustant (@bruno-roustant) on Nov 28 2019]
The randomization produced only one segment, although I thought I had ensured 2 segments even with randomization. To make the test more robust, I changed it to skip the segment-specific test counters and check only query results and scores when there are not exactly 2 segments.
[Legacy Jira: Bruno Roustant (@bruno-roustant) on Nov 28 2019]
Commit 8bd5d7dd2edacc096805e9519656504f29ebd04e in lucene-solr's branch refs/heads/master from Bruno Roustant https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=8bd5d7d
LUCENE-8983: TestPhraseWildcardQuery more robust wrt randomization.
[Legacy Jira: ASF subversion and git services on Nov 28 2019]
Commit e35de979916774937d854009387208c200f35584 in lucene-solr's branch refs/heads/branch_8x from Bruno Roustant https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=e35de97
LUCENE-8983: TestPhraseWildcardQuery more robust wrt randomization.
[Legacy Jira: ASF subversion and git services on Nov 28 2019]
Thanks for looking so quickly @bruno-roustant.
[Legacy Jira: Adrien Grand (@jpountz) on Nov 28 2019]
Closing after the 8.4.0 release.
[Legacy Jira: Adrien Grand (@jpountz) on Dec 29 2019]
A generalized version of PhraseQuery, built with one or more MultiTermQuery instances that provide term expansions for multi-terms (at least one of the expanded terms must match).
Its main advantage is to control the total number of expansions across all MultiTermQuery and across all segments.
This query is similar to MultiPhraseQuery, but it handles, controls and optimizes the multi-term expansions.
This query is equivalent to building an ordered SpanNearQuery with a list of SpanTermQuery and SpanMultiTermQueryWrapper, but it optimizes the multi-term expansions and the segment accesses. It first resolves the single terms and stops early if any of them does not match. Then it expands each multi-term sequentially, stopping immediately if one does not match. It detects the segments that do not match and skips them for the next expansions. This often avoids expanding the other multi-terms on some or even all segments. Finally, it controls the total number of expansions.
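The expansion strategy described above can be illustrated with a small toy model. This is only a sketch and not the Lucene implementation: a "segment" is modeled as a plain set of terms, a multi-term as a `Predicate<String>`, and the class name `PhraseExpansionSketch`, its `expand` method, and the `Result` fields are all hypothetical. It also assumes a single shared expansion budget consumed in order, which is a simplification of however the real query distributes its limit.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Toy model (hypothetical, not Lucene code): each segment is the set of terms it contains. */
public class PhraseExpansionSketch {

    /** Outcome of the expansion phase. */
    public static final class Result {
        public final boolean canMatch;              // false if a single term or a multi-term matched nothing
        public final List<Set<String>> expansions;  // expanded terms, one set per multi-term processed
        public final Set<Integer> liveSegments;     // indices of segments that may still match
        public final int expansionCount;            // total expansions consumed from the budget

        Result(boolean canMatch, List<Set<String>> expansions,
               Set<Integer> liveSegments, int expansionCount) {
            this.canMatch = canMatch;
            this.expansions = expansions;
            this.liveSegments = liveSegments;
            this.expansionCount = expansionCount;
        }
    }

    public static Result expand(List<Set<String>> segments,
                                List<String> singleTerms,
                                List<Predicate<String>> multiTerms,
                                int maxExpansions) {
        // Step 1: resolve single terms first; keep only segments that contain all of them.
        Set<Integer> live = new LinkedHashSet<>();
        for (int i = 0; i < segments.size(); i++) {
            if (segments.get(i).containsAll(singleTerms)) {
                live.add(i);
            }
        }
        List<Set<String>> expansions = new ArrayList<>();
        int count = 0;
        if (live.isEmpty()) {
            return new Result(false, expansions, live, count); // early stop, nothing expanded
        }

        // Step 2: expand each multi-term in turn, only on segments that are still live.
        for (Predicate<String> multiTerm : multiTerms) {
            Set<String> expanded = new TreeSet<>();
            Set<Integer> stillLive = new LinkedHashSet<>();
            for (int seg : live) {
                boolean matchedHere = false;
                for (String term : new TreeSet<>(segments.get(seg))) {
                    if (!multiTerm.test(term)) {
                        continue;
                    }
                    if (expanded.contains(term)) {
                        matchedHere = true;          // already expanded from an earlier segment
                    } else if (count < maxExpansions) {
                        expanded.add(term);          // consume one unit of the shared budget
                        count++;
                        matchedHere = true;
                    }
                }
                if (matchedHere) {
                    stillLive.add(seg);              // other segments are skipped from now on
                }
            }
            live = stillLive;
            expansions.add(expanded);
            if (expanded.isEmpty()) {
                return new Result(false, expansions, live, count); // stop at first failing multi-term
            }
        }
        return new Result(true, expansions, live, count);
    }
}
```

In this model, a segment that lacks a required single term is never visited during expansion, so its terms never consume any of the budget, and a multi-term that yields no expansion ends the process before the remaining multi-terms are touched.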
Legacy Jira details
LUCENE-8983 by Bruno Roustant (@bruno-roustant) on Sep 18 2019, resolved Nov 28 2019 Linked issues: