mikemccand / stargazers-migration-test

Testing Lucene's Jira -> GitHub issues migration

UnsupportedOperationException due to call to o.a.l.q.f.FunctionValues.floatVal [LUCENE-8674] #673

Open mikemccand opened 5 years ago

mikemccand commented 5 years ago

Requesting the following URL causes Solr to return an HTTP 500 error response:

http://localhost:8983/solr/films/select?fq={!frange%20l=10%20u=100}or_version_s,directed_by

The error response seems to be caused by the following uncaught exception:

java.lang.UnsupportedOperationException
at org.apache.lucene.queries.function.FunctionValues.floatVal(FunctionValues.java:47)
at org.apache.lucene.queries.function.FunctionValues$3.matches(FunctionValues.java:188)
at org.apache.lucene.queries.function.ValueSourceScorer$1.matches(ValueSourceScorer.java:53)
at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.doNext(TwoPhaseIterator.java:89)
at org.apache.lucene.search.TwoPhaseIterator$TwoPhaseIteratorAsDocIdSetIterator.nextDoc(TwoPhaseIterator.java:77)
at org.apache.lucene.search.Weight$DefaultBulkScorer.scoreAll(Weight.java:261)
at org.apache.lucene.search.Weight$DefaultBulkScorer.score(Weight.java:214)
at org.apache.lucene.search.BulkScorer.score(BulkScorer.java:39)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:652)
at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:443)
at org.apache.solr.search.DocSetUtil.createDocSetGeneric(DocSetUtil.java:151)
at org.apache.solr.search.DocSetUtil.createDocSet(DocSetUtil.java:140)
at org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:1177)
at org.apache.solr.search.SolrIndexSearcher.getPositiveDocSet(SolrIndexSearcher.java:817)
at org.apache.solr.search.SolrIndexSearcher.getProcessedFilter(SolrIndexSearcher.java:1025)
at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1540)
at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:1420)
at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:567)
at org.apache.solr.handler.component.QueryComponent.doProcessUngroupedSearch(QueryComponent.java:1434)

Sadly, I can't understand the logic of this code well enough to give any insights.

To set up an environment to reproduce this bug, follow the description in the ‘Environment’ field.

We found this issue and ~70 more like this using Diffblue Microservices Testing. Find more information on this fuzz testing campaign.


Legacy Jira details

LUCENE-8674 by Johannes Kloos on Jan 31 2019, updated Mar 10 2020

Environment:

h1. Steps to reproduce

* Use a Linux machine.
* Build commit {{ea2c8ba}} of Solr as described in the section below.
* Build the films collection as described below.
* Start the server using the command {{./bin/solr start -f -p 8983 -s /tmp/home}}
* Request the URL given in the bug description.

h1. Compiling the server

{noformat}
git clone https://github.com/apache/lucene-solr
cd lucene-solr
git checkout ea2c8ba
ant compile
cd solr
ant server
{noformat}

h1. Building the collection and reproducing the bug

We followed [Exercise 2|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html#exercise-2] from the [Solr Tutorial|http://lucene.apache.org/solr/guide/7_5/solr-tutorial.html].

{noformat}
mkdir -p /tmp/home
echo '<?xml version="1.0" encoding="UTF-8" ?><solr></solr>' > /tmp/home/solr.xml
{noformat}

In one terminal start a Solr instance in foreground:
{noformat}
./bin/solr start -f -p 8983 -s /tmp/home
{noformat}

In another terminal, create a collection of movies, with no shards and no replication, and initialize it:

{noformat}
bin/solr create -c films
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-field": {"name":"name", "type":"text_general", "multiValued":false, "stored":true}}' http://localhost:8983/solr/films/schema
curl -X POST -H 'Content-type:application/json' --data-binary '{"add-copy-field" : {"source":"*","dest":"_text_"}}' http://localhost:8983/solr/films/schema
./bin/post -c films example/films/films.json
curl -v "URL_BUG"
{noformat}

Please check the issue description for the “URL_BUG” value that reproduces the reported issue.

mikemccand commented 4 years ago

Hi,

New dev here, can I take up this issue?

[Legacy Jira: Rahul Yadav on Nov 23 2019]

mikemccand commented 4 years ago

I am looking at this.

[Legacy Jira: Rahul Yadav on Nov 23 2019]

mikemccand commented 4 years ago

I was able to reproduce the issue. I am getting the exception as described.

Analysis is ongoing.

[Legacy Jira: Rahul Yadav on Dec 01 2019]

mikemccand commented 4 years ago

This is due to a VectorValueSource being fed to a FunctionRangeQuery, which then tries to call its floatVal. By default, requesting the floatVal(int doc) of a VectorValueSource throws an UnsupportedOperationException, since no algorithm for merging the (possibly multiple) values is implemented.

For reference, the query Solr tries to run is the following,

new ConstantScoreQuery(
    new FunctionRangeQuery(
        new VectorValueSource(Arrays.asList(
            new BytesRefFieldSource("any_field"),
            new SortedSetFieldSource("another_field")
        )), 0, 100, true, true));

that always throws an exception if there are documents in the index.
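
A minimal sketch of why the range check fails, assuming a much-simplified model of the classes involved. `ValuesModel`, `SingleValuesModel`, `VectorValuesModel` and `rangeMatches` are hypothetical stand-ins, not Lucene's actual API. They only mirror the behaviour described above: the base class throws from `floatVal`, and the vector variant does not override it.

```java
public class FrangeFailureSketch {

    /** Hypothetical stand-in for a per-document values accessor. */
    abstract static class ValuesModel {
        // Mirrors the behaviour described above: unsupported unless a subclass overrides it.
        float floatVal(int doc) {
            throw new UnsupportedOperationException();
        }
    }

    /** A single numeric source: overrides floatVal, so range checks work. */
    static class SingleValuesModel extends ValuesModel {
        private final float[] perDoc;

        SingleValuesModel(float... perDoc) {
            this.perDoc = perDoc;
        }

        @Override
        float floatVal(int doc) {
            return perDoc[doc];
        }
    }

    /** Several sources bundled together: there is no single float to return, so no override. */
    static class VectorValuesModel extends ValuesModel {
        final ValuesModel[] parts;

        VectorValuesModel(ValuesModel... parts) {
            this.parts = parts;
        }
    }

    /** Inclusive range check, applied to one document at a time. */
    static boolean rangeMatches(ValuesModel values, int doc, float lower, float upper) {
        float v = values.floatVal(doc);
        return v >= lower && v <= upper;
    }

    public static void main(String[] args) {
        ValuesModel single = new SingleValuesModel(42f);
        System.out.println(rangeMatches(single, 0, 10f, 100f));

        ValuesModel vector = new VectorValuesModel(single, new SingleValuesModel(7f));
        try {
            rangeMatches(vector, 0, 10f, 100f);
        } catch (UnsupportedOperationException e) {
            System.out.println("range check on a vector source is unsupported");
        }
    }
}
```

In this model the failure only shows up once a document is actually tested, which matches the observation that the exception requires documents in the index.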

From the way it's implemented (with the UnsupportedOperationException), it doesn't look like this kind of inconsistency is meant to be fixed in Lucene, but I'm not sure about that.

Any suggestions are appreciated!

[Legacy Jira: Michele Palmia (@micpalmia) on Mar 02 2020]

mikemccand commented 4 years ago

I don't think VectorValueSource is involved here since it's only used by some spatial stuff in Solr.  The Environment/description info doesn't suggest spatial is used at all.

[Legacy Jira: David Smiley (@dsmiley) on Mar 09 2020]

mikemccand commented 4 years ago

The problematic query ( ?fq={!frange l=10 u=100}or_version_s,directed_by ) specifies two value sources separated by a comma (or_version_s,directed_by). These are parsed as a VectorValueSource embedding the two individual ValueSources corresponding to the two fields (see FunctionQParser.java:115).
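
The parsing step can be illustrated with a small, self-contained model. This is a sketch under stated assumptions, not the code in FunctionQParser: `parseFunction`, `Source`, `FieldSource` and `VectorSource` are hypothetical names, and the real parser handles far more than bare field names.

```java
import java.util.ArrayList;
import java.util.List;

public class CommaSourcesSketch {

    /** Hypothetical marker for anything a function query can evaluate. */
    interface Source {}

    /** A single field reference such as or_version_s. */
    static final class FieldSource implements Source {
        final String field;

        FieldSource(String field) {
            this.field = field;
        }

        @Override
        public String toString() {
            return "field(" + field + ")";
        }
    }

    /** Wrapper produced when more than one source is given. */
    static final class VectorSource implements Source {
        final List<Source> parts;

        VectorSource(List<Source> parts) {
            this.parts = parts;
        }

        @Override
        public String toString() {
            return "vector" + parts;
        }
    }

    /** Splits the function string on commas and wraps multiple sources in a VectorSource. */
    static Source parseFunction(String function) {
        List<Source> sources = new ArrayList<>();
        for (String name : function.split(",")) {
            sources.add(new FieldSource(name.trim()));
        }
        return sources.size() == 1 ? sources.get(0) : new VectorSource(sources);
    }

    public static void main(String[] args) {
        System.out.println(parseFunction("or_version_s"));
        System.out.println(parseFunction("or_version_s,directed_by"));
    }
}
```

Under this model, a single field name yields a plain field source, while the comma in the reported query silently produces the vector wrapper that the range filter cannot evaluate.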

[Legacy Jira: Michele Palmia (@micpalmia) on Mar 10 2020]