Open asfimport opened 5 years ago
Jeremie Miserez (migrated from JIRA)
Thanks, had the same issue. Two comments concerning the patch:
1) There is an additional case a few lines above that results in the same NPE (when there are no clauses at all and allSpanClauses.length == 0):
if (numNegatives == 0) {
    // The simple case - no negative elements in phrase
    return new SpanNearQuery(allSpanClauses, slopFactor, inOrder);
}
which would also need to be fixed the same way:
if (numNegatives == 0) {
    // The simple case - no negative elements in phrase
    if (allSpanClauses.length == 0) {
        // Invent a positive clause out of thin air.
        return new SpanTermQuery(new Term(field,
            "Dummy clause because no terms found - must match nothing"));
    }
    return new SpanNearQuery(allSpanClauses, slopFactor, inOrder);
}
2) You mention "a single synthetic clause that matches either everything or nothing". I tested this with phrase queries and it seems to make no difference. However, it does indeed make a difference in the case where stop-words or special characters are stripped away by the Analyzer during query parsing/rewrite. The QueryParserBase#getBooleanQuery() method has a comment to that effect:
protected Query getBooleanQuery(List<BooleanClause> clauses) throws ParseException {
    if (clauses.size()==0) {
        return null; // all clause words were filtered away by the analyzer.
    }
    // ...
While returning a "*" wildcard query, a MatchAllDocsQuery, or similar would prevent the NPE, it would yield wrong results for phrase queries and other queries: searching for a special character or a stopword would then match everything, which is incorrect. Matching nothing, as in the proposed patch, is therefore most likely the correct solution.
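The difference between the two fallbacks can be illustrated with a small toy model. This is only a sketch in plain Java, not Lucene code: the class EmptyQueryFallback, its stopword list, and the matchAllOnEmpty flag are all hypothetical names invented for the illustration, and the "analyzer" is just a whitespace split with a stopword filter.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Toy model, NOT Lucene code: shows why a query whose terms were all
// filtered away should match nothing rather than everything.
class EmptyQueryFallback {
    // Hypothetical stopword list standing in for an Analyzer.
    static final Set<String> STOPWORDS = new HashSet<String>(Arrays.asList("the", "a", "of"));

    // Keep only the query words that the stand-in analyzer does not strip.
    static List<String> analyze(String query) {
        List<String> terms = new ArrayList<String>();
        for (String word : query.toLowerCase().split("\\s+")) {
            if (!word.isEmpty() && !STOPWORDS.contains(word)) {
                terms.add(word);
            }
        }
        return terms;
    }

    // Return the documents containing every surviving term. When no term
    // survives analysis, matchAllOnEmpty selects between the two fallbacks.
    static List<String> search(List<String> docs, String query, boolean matchAllOnEmpty) {
        List<String> terms = analyze(query);
        if (terms.isEmpty()) {
            return matchAllOnEmpty ? new ArrayList<String>(docs) : new ArrayList<String>();
        }
        List<String> hits = new ArrayList<String>();
        for (String doc : docs) {
            if (Arrays.asList(doc.toLowerCase().split("\\s+")).containsAll(terms)) {
                hits.add(doc);
            }
        }
        return hits;
    }
}
```

In this model, searching three documents for "the" with matchAllOnEmpty set to true returns all three, even the ones that do not contain the word, while the match-nothing fallback returns an empty result.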
Requesting this URL in Solr gives a 500 error with a stack trace pointing to Lucene:
http://localhost:8983/solr/films/select?q={!complexphrase}genre:"-om*"
The stack trace is (cut down to the reasonably relevant part):
java.lang.NullPointerException
    at java.util.TreeMap.getEntry(TreeMap.java:347)
    at java.util.TreeMap.get(TreeMap.java:278)
    at org.apache.lucene.codecs.perfield.PerFieldPostingsFormat$FieldsReader.terms(PerFieldPostingsFormat.java:311)
    at org.apache.lucene.index.CodecReader.terms(CodecReader.java:106)
    at org.apache.lucene.index.FilterLeafReader.terms(FilterLeafReader.java:351)
    at org.apache.lucene.index.ExitableDirectoryReader$ExitableFilterAtomicReader.terms(ExitableDirectoryReader.java:91)
    at org.apache.lucene.search.spans.SpanNearQuery$SpanNearWeight.getSpans(SpanNearQuery.java:208)
    at org.apache.lucene.search.spans.SpanNotQuery$SpanNotWeight.getSpans(SpanNotQuery.java:127)
    at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:135)
    at org.apache.lucene.search.spans.SpanWeight.scorer(SpanWeight.java:46)
    at org.apache.lucene.search.Weight.bulkScorer(Weight.java:177)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:649)
    at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
    at org.apache.solr.search.SolrIndexSearcher.buildAndRunCollectorChain(SolrIndexSearcher.java:200)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1604)
The error is actually a bit deeper and can be traced back to the o.a.l.queryparser.complexPhrase.ComplexPhraseQueryParser class.
Handling this query involves constructing a SpanQuery, which happens in the rewrite method of ComplexPhraseQueryParser. The expression is first decomposed into a BooleanQuery that has exactly one clause, namely the negative clause -genre:"om*". The rewrite method then transforms this into a SpanQuery; in this case, it takes the path that handles complex queries with both positive and negative clauses. It extracts the subset of positive clauses, which is empty for this query. The positive clauses are then combined into a SpanNearQuery (around line 340), which is in turn used to build a SpanNotQuery. Further down the line, the field attribute of the SpanNearQuery is accessed and used as a key into a TreeMap. Because the set of positive clauses was empty, the SpanNearQuery never had its field attribute set, so the key is null and the TreeMap lookup throws the NullPointerException. A possible fix would be to detect the situation where the set of positive clauses is empty and include a single synthetic clause that matches either everything or nothing. See attached file 0001-Fix-NullPointerException.patch.
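The failure mechanism described above can be reproduced in isolation. The following is a minimal sketch in plain Java, assuming only that the per-field lookup is a TreeMap with natural key ordering (as the stack trace suggests); the class EmptyClauseLookup and its methods are hypothetical stand-ins and not Lucene APIs.

```java
import java.util.List;
import java.util.TreeMap;

// Simplified stand-in for the failure path; none of this is Lucene code.
class EmptyClauseLookup {

    // Hypothetical model of a "near" query: its field is taken from its
    // first clause, so an empty clause list leaves the field null.
    static String fieldOf(List<String> clauseFields) {
        return clauseFields.isEmpty() ? null : clauseFields.get(0);
    }

    // Per-field lookup in a TreeMap with natural ordering. Such a map
    // rejects null keys, so a null field throws NullPointerException.
    static String termsFor(TreeMap<String, String> termsByField, String field) {
        return termsByField.get(field);
    }

    // Guarded variant: an empty clause list is treated as "matches nothing"
    // (modelled here as a null result) instead of reaching the lookup.
    static String termsForGuarded(TreeMap<String, String> termsByField,
                                  List<String> clauseFields) {
        String field = fieldOf(clauseFields);
        return field == null ? null : termsByField.get(field);
    }
}
```

Calling termsFor with the field of an empty clause list fails in the same place as the reported trace (inside TreeMap.get), while termsForGuarded never performs the lookup in that case.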
This bug was found using Diffblue Microservices Testing.
Migrated from LUCENE-8666 by Johannes Kloos, 1 vote, updated Apr 01 2019
Attachments: 0001-Fix-NullPointerException.patch, home.zip