A Time-limiting collector that works with CollectorManagers [LUCENE-8319]

mikemccand commented 6 years ago

Currently Lucene has TimeLimitingCollector to support time-bound collection and it will throw TimeExceededException if timeout happens. This only works nicely with the single-thread low-level API from the IndexSearcher. The method signature is –

void search(List<LeafReaderContext> leaves, Weight weight, Collector collector)

The intended use is to always enclose the searcher.search(query, collector) call with a try ... catch and handle the timeout exception. Unfortunately when working with a CollectorManager in the multi-thread search context, the TimeExceededException thrown during collecting one leaf slice will be re-thrown by IndexSearcher without calling CollectorManager's reduce(), even if other slices are successfully collected. The signature of the search api with CollectorManager is –

<C extends Collector, T> T search(Query query, CollectorManager<C, T> collectorManager) The good news is that IndexSearcher handles CollectionTerminatedException gracefully by ignoring it. We can either wrap TimeLimitingCollector and throw CollectionTerminatedException when timeout happens or simply replace TimeExceededException with CollectionTerminatedException. In either way, we also need to maintain a flag that indicates if timeout occurred so that the user know it's a partial collection.

Legacy Jira details

LUCENE-8319 by Tony Xu on May 16 2018, updated Sep 02 2020

mikemccand commented 6 years ago

I wonder if we could have TimeExceeededException extend CollectionTerminatedException – if that fixes the issue, it would be a pretty small change, and I don't think there are any cases where you would want to catch CTE and not catch TEE ?

[Legacy Jira: Michael Sokolov (@msokolov) on May 18 2018]

mikemccand commented 6 years ago

I wonder if we could have TimeExceeededException extend CollectionTerminatedException

I think that's a good approach? They both extend RuntimeException today. And then we could add a getter on TimeLimitingCollector to see if a timeout occurred.

[Legacy Jira: Michael McCandless (@mikemccand) on May 21 2018]

mikemccand commented 4 years ago

Note that there is also ExitableDirectoryReader which seems to compete with TimeLimitingCollector. IMO EDR is better because it extends earlier to query rewrite phase, and TLC has maybe no advantages? I'd rather see TLC removed. Any way, I bring this up because I'm not sure how EDR plays with concurrent search. Maybe just fine, maybe there is a parallel concern there with the proposal above.

[Legacy Jira: David Smiley (@dsmiley) on Aug 12 2020]

mikemccand commented 4 years ago

A problem with TimeLimitingCollector and ExitableDirectoryReader is that they add layers of abstraction to things that are called in very tight loops. One combination that we found to work well for Elasticsearch is to use ExitableDirectoryReader only for terms/points and make IndexSearcher wrap the top-level bulk scorer to split the doc ID space in exponentially growing windows of doc IDs and check the timeout between windows in order to keep the overhead to a minimum. Timeout handling seems to be a frequent need so maybe we should add support for it directly on IndexSearcher where we could more easily do the right thing?

[Legacy Jira: Adrien Grand (@jpountz) on Aug 29 2020]

mikemccand commented 4 years ago

Timeout handling seems to be a frequent need so maybe we should add support for it directly on IndexSearcher where we could more easily do the right thing?

+1 to implement this in IndexSearcher, carefully. The exponential backoff sounds like a great approach!

[Legacy Jira: Michael McCandless (@mikemccand) on Sep 02 2020]

mikemccand / stargazers-migration-test

A Time-limiting collector that works with CollectorManagers [LUCENE-8319] #319

Legacy Jira details