Open mikemccand opened 5 years ago
I think the safest bet here is to run the rejected task on the caller thread. Any objections/thoughts?
[Legacy Jira: Atri Sharma (@atris) on Sep 25 2019]
Commit 15db6bfa88952cf0912b3c93d59c0cdc55bf9e2a in lucene-solr's branch refs/heads/master from Atri Sharma https://gitbox.apache.org/repos/asf?p=lucene-solr.git;h=15db6bf
LUCENE-8989: Allow IndexSearcher To Handle Rejected Execution (#899)
When executing queries using Executors, we should gracefully handle the case when Executor rejects a task and run the task on the caller thread
[Legacy Jira: ASF subversion and git services on Sep 27 2019]
Hi @atris,
CI complains with this commit:
ant test -Dtestcase=TestIndexSearcher -Dtests.seed=54AF47B9FD7B53A8 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=en-TC -Dtests.timezone=Arctic/Longyearbyen -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
Error:
07:39:50 [junit4] > Throwable #1: java.lang.Exception: Suite timeout exceeded (>= 7200000 msec).
07:39:50 [junit4] > at __randomizedtesting.SeedInfo.seed([54AF47B9FD7B53A8]:0)
07:39:50 [junit4] Completed [478/524 (1!)] on J1 in 7220.39s, 3 tests, 2 errors <<< FAILURES!
[Legacy Jira: Ignacio Vera (@iverase) on Sep 27 2019]
Interesting – Thanks for highlighting, I will take a look.
[Legacy Jira: Atri Sharma (@atris) on Sep 27 2019]
@ivera I have pushed a fix and beasted the relevant tests:
ant beast -Dbeast.iters=20 -Dtestcase=TestIndexSearcher -Dtests.seed=54AF47B9FD7B53A8 -Dtests.slow=true -Dtests.badapples=true -Dtests.locale=en-TC -Dtests.timezone=Arctic/Longyearbyen -Dtests.asserts=true -Dtests.file.encoding=US-ASCII
Could you check now, please?
[Legacy Jira: Atri Sharma (@atris) on Sep 27 2019]
I'd rather that we respect the policy in the ThreadPoolExecutor, which can itself can specify "run in caller" https://docs.oracle.com/javase/7/docs/api/java/util/concurrent/ThreadPoolExecutor.CallerRunsPolicy.html
If we always run in caller, it doesn't allow the caller to specify an alternative policy. Run in caller is typically the default, so if some explicit rejection policy has been provided, we're just frustrating the intention with this.
[Legacy Jira: Michael Sokolov (@msokolov) on Sep 28 2019]
If the caller has specified a different policy – we should not be even getting here. IndexSearcher#search API does not advertise what happens if the underlying Executor does not have enough capacity, nor does it consider the size of the executor during allocation of leaves. Both of the above facts lead to an implicit API contract where we are expected to gracefully handle the case of unavailable capacity. In case the user wishes to handle the exception themselves, they are free to override the default methods – similar to what we advice for other methods in IndexSearcher.
[Legacy Jira: Atri Sharma (@atris) on Sep 28 2019]
If the caller has specified a different policy – we should not be even getting here
Yes, I see that's fair. I suppose the only thing we are preventing here is having the exception bubble up, terminating the search. But why would we seek to prevent that here when the caller already has a mechanism for indicating the caller-runs policy? Suppose the executor has been shut down and is rejecting all requests for new work. In that case an Exception here would help the caller to understand they are doing something unusual instead of burying the error case and attempting to handle it.
[Legacy Jira: Michael Sokolov (@msokolov) on Sep 28 2019]
Suppose the executor has been shut down and is rejecting all requests for new work. In that case an Exception here would help the caller to understand they are doing something unusual instead of burying the error case and attempting to handle it.
Hmm, that is a good point – maybe we should not handle this exception and let it throw?
[Legacy Jira: Michael McCandless (@mikemccand) on Oct 02 2019]
Hmm, that is a good point – maybe we should not handle this exception and let it throw?
I would still argue that the default case should be to handle the exception and let the query execute gracefully. The case where executor has been erratically shut down is a good point, but for edges like this, we should probably override IndexSearcher's default behaviour. If it helps, we could potentially introduce a toggle which is enabled by default and indicates that IndexSearcher will perform the graceful handling, and if the user wishes the error to propagate, just call a new API method to disable the toggle?
[Legacy Jira: Atri Sharma (@atris) on Oct 03 2019]
As discussed in https://github.com/apache/lucene-solr/pull/815, IndexSearcher should handle the case when the executor rejects the execution of a task (unavailability of threads?).
Legacy Jira details
LUCENE-8989 by Atri Sharma (@atris) on Sep 25 2019, updated Oct 03 2019