apache / lucene

Apache Lucene open-source search software
https://lucene.apache.org/
Apache License 2.0

remove Collector specializations [LUCENE-5229] #6293

Open asfimport opened 11 years ago

asfimport commented 11 years ago

There are too many collector specializations (I think 16 or 18?) and too many crazy defaults, like returning NaN scores to the user by default in IndexSearcher.

This confuses HotSpot (I will ignore any benchmarks posted here where only one type of sort is running through the JVM; that's unrealistic), and it confuses users with stuff like NaN scores coming back by default.

I have two concrete suggestions:

- nuke doMaxScores. It's implicit from doScores
- change doScores to true by default in IndexSearcher


Migrated from LUCENE-5229 by Robert Muir (@rmuir)

asfimport commented 11 years ago

Shai Erera (@shaie) (migrated from JIRA)

> nuke doMaxScores. It's implicit from doScores

+1, if you ask to compute scores, you might as well get maxScore. I doubt that specialization is so important.

> change doScores to true by default in IndexSearcher

I'm not sure about it. I wasn't confused by the fact that I received NaN, only pointed out that when you use Expression, the result is not in the 'score' field, but the 'field' field. I think that in most cases, if you sort, you're interested in the sort-by value, not the score. Not sure if it buys performance or not, but I think it's just redundant work.
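To make the two flags under discussion concrete, here is a minimal toy sketch in Java. It is not Lucene code: `ToySortedSearch`, `Doc`, `Hit`, and `TopHits` are hypothetical names invented for illustration, and the sort is a plain in-memory sort on one long field. It only models the behaviour described in this thread: with score tracking off, a sorted search returns NaN in the score slot, and the one-flag overload reflects the suggestion that asking for scores should imply getting maxScore.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Toy model only: none of these types are Lucene classes, all names are hypothetical.
class ToySortedSearch {

    // A matching document: id, relevance score, and the value of the field we sort by.
    static final class Doc {
        final int id; final float score; final long sortValue;
        Doc(int id, float score, long sortValue) {
            this.id = id; this.score = score; this.sortValue = sortValue;
        }
    }

    // One returned hit; score stays NaN unless score tracking was requested.
    static final class Hit {
        final int id; final float score; final long sortValue;
        Hit(int id, float score, long sortValue) {
            this.id = id; this.score = score; this.sortValue = sortValue;
        }
    }

    // Top hits plus the maximum score over all matches (NaN if not tracked).
    static final class TopHits {
        final List<Hit> hits; final float maxScore;
        TopHits(List<Hit> hits, float maxScore) {
            this.hits = hits; this.maxScore = maxScore;
        }
    }

    // Two-boolean shape: per-hit scores and the max score are separate opt-ins.
    static TopHits search(List<Doc> matches, int n, boolean doScores, boolean doMaxScore) {
        List<Doc> sorted = new ArrayList<>(matches);
        sorted.sort(Comparator.comparingLong((Doc d) -> d.sortValue)); // ascending by sort field

        float maxScore = Float.NaN;
        if (doMaxScore && !matches.isEmpty()) {
            maxScore = Float.NEGATIVE_INFINITY;
            for (Doc d : matches) {
                maxScore = Math.max(maxScore, d.score);
            }
        }

        List<Hit> hits = new ArrayList<>();
        for (Doc d : sorted.subList(0, Math.min(n, sorted.size()))) {
            hits.add(new Hit(d.id, doScores ? d.score : Float.NaN, d.sortValue));
        }
        return new TopHits(hits, maxScore);
    }

    // One-boolean shape: asking for scores also yields the max score.
    static TopHits search(List<Doc> matches, int n, boolean doScores) {
        return search(matches, n, doScores, doScores);
    }

    public static void main(String[] args) {
        List<Doc> docs = Arrays.asList(
            new Doc(1, 0.5f, 30L), new Doc(2, 2.0f, 10L), new Doc(3, 1.0f, 20L));
        TopHits unscored = search(docs, 2, false, false);
        TopHits scored = search(docs, 2, true);
        System.out.println(unscored.hits.get(0).score + " " + unscored.maxScore);
        System.out.println(scored.hits.get(0).score + " " + scored.maxScore);
    }
}
```

In this sketch the sort-by value is always available on the hit, which is the point made above: the score is extra work that a caller sorting by a field may not need.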

asfimport commented 11 years ago

Robert Muir (@rmuir) (migrated from JIRA)

> I wasn't confused by the fact that I received NaN, only pointed out that when you use Expression, the result is not in the 'score' field, but the 'field' field.

You invoked IndexSearcher.search(query, filter, n, Sort) and you were surprised that the result of the sort goes there?

I think this kinda stuff only serves to reinforce my argument that this stuff is way too specialized and complicated.

asfimport commented 11 years ago

Robert Muir (@rmuir) (migrated from JIRA)

> nuke doMaxScores. It's implicit from doScores
>
> +1, if you ask to compute scores, you might as well get maxScore. I doubt that specialization is so important.

I will split off a subtask for this since I don't think it's controversial. I at least want to make some progress on this. Removing confusing booleans from the API of IndexSearcher is also huge to me, and this will take care of one.