mikemccand / stargazers-migration-test

Testing Lucene's Jira -> GitHub issues migration

TopDocsCollector Should Not Depend on Priority Queue [LUCENE-8877] #874

Open mikemccand opened 5 years ago

mikemccand commented 5 years ago

TopDocsCollector is tightly coupled to the notion of a priority queue. That is not necessarily a good abstraction, since the collector really just needs an interface for holding and iterating over docIDs and scores, possibly with shard indexes.

We should rewrite this against a simpler interface, with the priority queue as the default implementation.
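
To make the proposal concrete, here is a minimal sketch of what such an interface could look like. It does not use Lucene's classes: `Hit`, `HitStore`, and `PriorityQueueHitStore` are hypothetical names invented for illustration, and the tie-breaking rule (lower docID wins on equal scores) is an assumption.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical hit holder: docID, score, and an optional shard index (-1 if unset). */
final class Hit {
    final int doc;
    final float score;
    final int shardIndex;

    Hit(int doc, float score, int shardIndex) {
        this.doc = doc;
        this.score = score;
        this.shardIndex = shardIndex;
    }
}

/** Hypothetical abstraction: all a top-docs collector would need from its backing store. */
interface HitStore {
    /** Offers a candidate hit; the store decides whether to keep it. */
    void offer(Hit hit);

    /** Returns the retained hits, best first. */
    List<Hit> topHits();
}

/** Default implementation backed by a bounded min-heap holding at most k hits. */
final class PriorityQueueHitStore implements HitStore {
    // Weakest hit first: the lower score loses; on equal scores the higher docID loses (assumed rule).
    private static final Comparator<Hit> WEAKEST_FIRST = (a, b) -> {
        int byScore = Float.compare(a.score, b.score);
        return byScore != 0 ? byScore : Integer.compare(b.doc, a.doc);
    };

    private final int k;
    private final PriorityQueue<Hit> heap;

    PriorityQueueHitStore(int k) {
        if (k < 1) {
            throw new IllegalArgumentException("k must be >= 1");
        }
        this.k = k;
        this.heap = new PriorityQueue<>(k, WEAKEST_FIRST);
    }

    @Override
    public void offer(Hit hit) {
        if (heap.size() < k) {
            heap.add(hit);
        } else if (WEAKEST_FIRST.compare(hit, heap.peek()) > 0) {
            heap.poll(); // evict the current weakest hit
            heap.add(hit);
        }
    }

    @Override
    public List<Hit> topHits() {
        List<Hit> hits = new ArrayList<>(heap);
        hits.sort(WEAKEST_FIRST.reversed()); // best hit first
        return hits;
    }
}
```

Under this sketch, a collector would only call `offer` and `topHits`, so a different backing structure could be swapped in without touching the collector.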


Legacy Jira details

LUCENE-8877 by Atri Sharma (@atris) on Jun 24 2019, updated Jun 26 2019

mikemccand commented 5 years ago

Any thoughts on this? I envision eventually reaching a state where the underlying data structure is opaque to the IndexSearcher API. That should give us an abstraction with a high degree of flexibility.

[Legacy Jira: Atri Sharma (@atris) on Jun 25 2019]

mikemccand commented 5 years ago

Abstraction increases complexity too. It feels reasonable to me that top-docs collectors are backed by a priority queue, since this is the go-to data structure for top-k selection problems. If you need more flexibility, you could directly extend Collector as opposed to TopDocsCollector?
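
As a rough illustration of the last point, here is a sketch of a collector-like class that bypasses the priority queue entirely because it only tracks the single best hit. `HitCallback` and `BestHitCallback` are hypothetical stand-ins invented for this example; they are not Lucene's `Collector` API, whose real contract is richer than a single per-hit callback.

```java
/** Hypothetical stand-in for a per-hit callback; NOT Lucene's Collector interface. */
interface HitCallback {
    void collect(int doc, float score);
}

/** Tracks only the single best hit, so no priority queue is needed. */
final class BestHitCallback implements HitCallback {
    private int bestDoc = -1; // -1 means no hit has been collected yet
    private float bestScore = Float.NEGATIVE_INFINITY;

    @Override
    public void collect(int doc, float score) {
        // Strictly greater: on equal scores the hit collected first is kept.
        if (score > bestScore) {
            bestScore = score;
            bestDoc = doc;
        }
    }

    int bestDoc() {
        return bestDoc;
    }

    float bestScore() {
        return bestScore;
    }
}
```

A use case like this needs no top-k structure at all, which is the kind of situation where writing a dedicated collector may be simpler than generalizing TopDocsCollector.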

[Legacy Jira: Adrien Grand (@jpountz) on Jun 26 2019]