apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal [LUCENE-8674] #9720

Open asfimport opened 5 years ago

asfimport commented 5 years ago

Requesting the following URL causes Solr to return an HTTP 500 error response:

http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by

The error response seems to be caused by the following uncaught exception:

java.lang.UnsupportedOperationException
at org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47)
at org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188)
at org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53)
at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89)
at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77)
at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261)
at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214)
at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151)
at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140)
at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177)
at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817)
at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1025)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567)
at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1434)

Sadly, I can't understand the logic of this code well enough to give any insights.

To set up an environment to reproduce this bug, follow the description in the ‘Environment’ field.

We found this issue and ~70 more like it using Diffblue Microservices Testing. Find more information on this fuzz testing campaign.


Migrated from LUCENE-8674 by Johannes Kloos, updated Mar 10 2020

Environment:

h1. Steps to reproduce

* Use a Linux machine.
* Build commit {{ea2c8ba}} of Solr as described in the section below.
* Build the films collection as described below.
* Start the server using the command {{./bin/solr start -f -p 8983 -s /tmp/home}}
* Request the URL given in the bug description.

h1. Compiling the server

{noformat}
git clone https://github.com/apache/lucene-solr
cd lucene-solr
git checkout ea2c8ba
ant compile
cd solr
ant server
{noformat}

h1. Building the collection and reproducing the bug

We followed [Exercise 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from the [Solr Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html].

{noformat}
mkdir -p /tmp/home
echo '<?xml version="1.0" encoding="UTF-8" ?><solr></solr>' > /tmp/home/solr.xml
{noformat}

In one terminal start a Solr instance in foreground:
{noformat}
./bin/solr start -f -p 8983 -s /tmp/home
{noformat}

In another terminal, create a collection of movies, with no shards and no replication, and initialize it:

{noformat}
bin/solr create -c films
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
./bin/post -c films example/films/films.json
curl -v "URL_BUG"
{noformat}

Check the issue description for the URL to substitute for "URL_BUG" in order to reproduce the reported issue.

asfimport commented 4 years ago

Rahul Yadav (migrated from JIRA)

Hi,

New dev here. Can I take up this issue?

asfimport commented 4 years ago

Rahul Yadav (migrated from JIRA)

I am looking at this

asfimport commented 4 years ago

Rahul Yadav (migrated from JIRA)

I was able to reproduce the issue and get the exception as described.

Analysis is ongoing.

asfimport commented 4 years ago

Michele Palmia (@micpalmia) (migrated from JIRA)

This is caused by a VectorValueSource being fed to a FunctionRangeQuery, which then tries to use its floatVal. By default, requesting the floatVal(int doc) of a VectorValueSource throws an UnsupportedOperationException, since no algorithm for merging the (possibly multiple) values is implemented.

For reference, the query Solr tries to run is the following:

new ConstantScoreQuery(
    new FunctionRangeQuery(
        new VectorValueSource(
            new BytesRefFieldSource("any_field"),
            new SortedSetFieldSource("another_field")
        ), 0, 100, true, true));

which always throws an exception if there are documents in the index.
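
The failure mode described above can be modeled without Lucene on the classpath. The following is a minimal sketch under stated assumptions, not Lucene's code: `Values`, `ConstantValues`, `VectorValues`, and `matchesRange` are hypothetical stand-ins that only mirror the pattern of a base class whose floatVal throws unless a subclass overrides it.

```java
import java.util.List;

// Hypothetical model of the failure; none of these classes are Lucene's.
class RangeFilterSketch {
    // Inclusive range check that, like a function range filter, needs one float per document.
    static boolean matchesRange(Values values, int doc, float lower, float upper) {
        float v = values.floatVal(doc);
        return v >= lower && v <= upper;
    }

    public static void main(String[] args) {
        // A single-valued source can be range-checked.
        System.out.println(matchesRange(new ConstantValues(42f), 0, 10f, 100f));

        // A vector of sources keeps the throwing default, so the check fails.
        Values vector = new VectorValues(List.of(new ConstantValues(1f), new ConstantValues(2f)));
        try {
            matchesRange(vector, 0, 10f, 100f);
        } catch (UnsupportedOperationException e) {
            System.out.println("range check on a vector source is unsupported");
        }
    }
}

// Stand-in for a per-document values accessor: floatVal is unsupported by default.
abstract class Values {
    float floatVal(int doc) {
        throw new UnsupportedOperationException();
    }
}

// A source that has exactly one float per document and therefore overrides floatVal.
class ConstantValues extends Values {
    private final float value;

    ConstantValues(float value) {
        this.value = value;
    }

    @Override
    float floatVal(int doc) {
        return value;
    }
}

// A source made of several parts; there is no single float to return, so floatVal is not overridden.
class VectorValues extends Values {
    final List<Values> parts;

    VectorValues(List<Values> parts) {
        this.parts = parts;
    }
}
```

In this model the exception surfaces only when a document is actually evaluated, which matches the observation that the query fails only if the index contains documents.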

From the way it's implemented (with the UnsupportedOperationException), it doesn't look like this kind of inconsistency is meant to be fixed in Lucene, but I'm not sure about that.

Any suggestions are appreciated!

asfimport commented 4 years ago

David Smiley (@dsmiley) (migrated from JIRA)

I don't think VectorValueSource is involved here, since it's only used by some spatial stuff in Solr. The Environment/description info doesn't suggest spatial is used at all.

asfimport commented 4 years ago

Michele Palmia (@micpalmia) (migrated from JIRA)

The problematic query ( ?fq={!frange l=10 u=100}or_version_s,directed_by ) specifies two value sources separated by a comma (or_version_s,directed_by). These are parsed as a VectorValueSource embedding the two individual ValueSources corresponding to the two fields (see FunctionQParser.java:115).
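
The parsing behavior described above can be illustrated with a small standalone sketch. It is not FunctionQParser's code: `FrangeParseSketch` and its methods are hypothetical, the comma split ignores nested function calls, and `popularity` is an assumed field name used only for contrast.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of how one fq value yields either a single source or a vector.
class FrangeParseSketch {
    // Strips a leading {!...} local-params prefix, if present, and returns the function body.
    static String functionBody(String fq) {
        if (fq.startsWith("{!")) {
            int end = fq.indexOf('}');
            if (end >= 0) {
                return fq.substring(end + 1);
            }
        }
        return fq;
    }

    // Splits the body on commas; nested parentheses are not handled in this sketch.
    static List<String> sources(String body) {
        return Arrays.asList(body.split(","));
    }

    // One source stays single; several sources are wrapped together as a vector.
    static String describe(String encodedFq) {
        String fq = URLDecoder.decode(encodedFq, StandardCharsets.UTF_8);
        List<String> parts = sources(functionBody(fq));
        return parts.size() == 1 ? "single(" + parts.get(0) + ")" : "vector" + parts;
    }

    public static void main(String[] args) {
        // The fq value from the bug report: two comma-separated fields.
        System.out.println(describe("{!frange%20l=10%20u=100}or_version_s,directed_by"));
        // A single (assumed) field for comparison.
        System.out.println(describe("{!frange%20l=10%20u=100}popularity"));
    }
}
```

The first call reports a vector of two sources, which is the shape that a range filter cannot reduce to one float per document.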