mikemccand / stargazers-migration-test

Testing Lucene's Jira -> GitHub issues migration

WeightedSpanTermExtractor unnecessarily enforces rewrite for some SpanQueries [LUCENE-8637] #636

Open mikemccand opened 5 years ago

mikemccand commented 5 years ago

Method mustRewriteQuery(SpanQuery) returns true for SpanPositionCheckQuery, SpanContainingQuery, SpanWithinQuery, and SpanBoostQuery. However, these queries do not themselves require rewriting. One effect is that UnifiedHighlighter does not work with OffsetSource Postings for such queries and switches to Analysis, which of course has consequences for performance.

I attach a patch for Lucene version 7.6.0. I have not checked whether it breaks existing unit tests.
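To illustrate the idea (this is not the attached patch), here is a minimal, self-contained sketch of the difference between treating every unknown wrapper as "must rewrite" and recursing into the wrapped query instead. All types below (Node, Term, MultiTerm, Wrapper, Composite) and both method names are hypothetical stand-ins invented for this example; they are not Lucene classes and do not reproduce the real mustRewriteQuery code.

```java
import java.util.Arrays;
import java.util.List;

public class RewriteCheckSketch {
    // Hypothetical stand-ins for span query types; NOT the real Lucene classes.
    interface Node {}

    // Plays the role of a plain term query: never needs rewriting.
    static final class Term implements Node {}

    // Plays the role of a multi-term wrapper: always needs rewriting.
    static final class MultiTerm implements Node {}

    // Plays the role of a single-child wrapper (e.g. a boost or position check).
    static final class Wrapper implements Node {
        final Node inner;
        Wrapper(Node inner) { this.inner = inner; }
    }

    // Plays the role of a query with several sub-queries (e.g. containing/within).
    static final class Composite implements Node {
        final List<Node> children;
        Composite(Node... children) { this.children = Arrays.asList(children); }
    }

    // Conservative check: anything that is not a plain term is assumed to need rewriting.
    static boolean mustRewriteConservative(Node q) {
        return !(q instanceof Term);
    }

    // Recursive check: a wrapper needs rewriting only if something it wraps does.
    static boolean mustRewriteRecursive(Node q) {
        if (q instanceof Term) {
            return false;
        }
        if (q instanceof Wrapper) {
            return mustRewriteRecursive(((Wrapper) q).inner);
        }
        if (q instanceof Composite) {
            for (Node child : ((Composite) q).children) {
                if (mustRewriteRecursive(child)) {
                    return true;
                }
            }
            return false;
        }
        return true; // unknown or multi-term: stay on the safe side
    }

    public static void main(String[] args) {
        Node wrappedTerm = new Wrapper(new Term());
        Node wrappedMulti = new Wrapper(new MultiTerm());
        System.out.println(mustRewriteConservative(wrappedTerm)); // true
        System.out.println(mustRewriteRecursive(wrappedTerm));    // false
        System.out.println(mustRewriteRecursive(wrappedMulti));   // true
    }
}
```

In this sketch a wrapper around a plain term no longer forces a rewrite, while a wrapper around a multi-term query still does, so the recursive variant only skips the rewrite when nothing underneath needs it.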


Legacy Jira details

LUCENE-8637 by Christoph Goller on Jan 14 2019, updated Mar 21 2022

Attachments: WeightedSpanTermExtractor.java

Linked issues:

mikemccand commented 5 years ago

patched version of WeightedSpanTermExtractor

[Legacy Jira: Christoph Goller on Jan 14 2019]

mikemccand commented 2 years ago

[~goller@detego-software.de] Can you check if it breaks unit tests? Do you know how to run the whole suite?

[Legacy Jira: Marcus Eagan on Oct 08 2021]

mikemccand commented 2 years ago

Method mustRewriteQuery(SpanQuery) returns true for ... and SpanBoostQuery, however, these queries do not require rewriting. ...

In LUCENE-10477 we found that a SpanBoostQuery does require rewriting if it wraps a SpanMultiTermQueryWrapper query. (If the boost factor is 1.0f, however, the rewriting doesn't happen due to a bug in SpanBoostQuery.rewrite, i.e. the rewrite is a no-op when it shouldn't be.)

[Legacy Jira: Christine Poerschke (@cpoerschke) on Mar 21 2022]