Open frank-montyne opened 1 year ago
@swallez any ideas on this one?
+1
flush() should wait for the listeners as well, as was the case with the previous implementation of the BulkProcessor.
I have a scenario where I need to use the BulkIngester to handle retries, metrics, etc., but I want to control when the request is sent so that I can roll back the distributed operation in case of failure. My idea was to just set the maximum size/number of requests to Long.MAX_VALUE and call flush() when certain conditions relevant to my use case were met.
The problem is that there is no way to know whether the operation succeeded, since the listener's afterBulk calls happen after things are considered "complete", unlike the way the BulkProcessor worked in previous versions.
...
if (exec != null) {
    // A request was actually sent
    exec.futureResponse.handle((resp, thr) -> {
        sendRequestCondition.signalIfReadyAfter(() -> {
            requestsInFlightCount--;
            closeCondition.signalAllIfReady();
        });
        // Problem between the above ^ and below \/
        if (resp != null) {
            // Success
            if (listener != null) {
                listener.afterBulk(exec.id, exec.request, exec.contexts, resp);
            }
...
I have no way to synchronise the call in my scenario, where I want to throw an exception synchronously when the response to a flush() contains any errors. If I try to check the number of requests in flight or pending, there is a race condition: the count can reach zero before the callbacks are called.
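To make the window concrete, here is a small stand-alone model of the ordering. It uses only java.util.concurrent and is not the BulkIngester code; the class, method, and variable names are made up for the illustration, and the pause inside the handler is artificial so that the interleaving happens every time instead of occasionally.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical model, not part of the Elasticsearch Java client.
public class CallbackOrderingModel {

    /**
     * Returns true if a waiter that observed "zero requests in flight"
     * could also observe that the listener callback had not run yet.
     */
    static boolean waiterMissesCallback(boolean signalBeforeListener) throws InterruptedException {
        AtomicInteger inFlight = new AtomicInteger(1);
        AtomicBoolean listenerRan = new AtomicBoolean(false);
        CountDownLatch counterReachedZero = new CountDownLatch(1);
        CountDownLatch waiterLooked = new CountDownLatch(1);
        CompletableFuture<String> futureResponse = new CompletableFuture<>();

        futureResponse.handle((resp, thr) -> {
            if (signalBeforeListener) {
                // Same order as the excerpt above: bookkeeping first, listener second.
                inFlight.decrementAndGet();
                counterReachedZero.countDown();
                awaitQuietly(waiterLooked); // artificial pause to force the interleaving
                listenerRan.set(true);      // stands in for listener.afterBulk(...)
            } else {
                // Alternative order: listener first, bookkeeping second.
                listenerRan.set(true);
                inFlight.decrementAndGet();
                counterReachedZero.countDown();
            }
            return null;
        });

        // Complete the future on another thread, as an async client would.
        Thread completer = new Thread(() -> futureResponse.complete("ok"));
        completer.start();

        // The waiter (think: the caller of flush() or close()) wakes up here.
        counterReachedZero.await();
        boolean missed = inFlight.get() == 0 && !listenerRan.get();
        waiterLooked.countDown();
        completer.join();
        return missed;
    }

    private static void awaitQuietly(CountDownLatch latch) {
        try {
            latch.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println("signal first:   " + waiterMissesCallback(true));  // prints true
        System.out.println("listener first: " + waiterMissesCallback(false)); // prints false
    }
}
```

In the model, running the listener before the bookkeeping closes the window. Whether the same reordering is acceptable inside the BulkIngester is a separate question that this sketch does not answer.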
If flush() returned the exec.id, we could wait until that id was passed to afterBulk and sync things up, but it would still be an ugly workaround that wasn't required with the BulkProcessor.
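For what it's worth, a workaround along those lines can be sketched without the id, by counting callbacks on the caller's side. The helper below is hypothetical (BulkCallbackTracker and its methods are names I made up, and it has no dependency on the client). The idea is to call onBeforeBulk() from the listener's beforeBulk and onAfterBulk(...) as the last statement of both afterBulk overloads, then call awaitAll(...) right after flush(). It assumes that beforeBulk is invoked before flush() returns for the request it sends, which I have not verified.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper, not part of the Elasticsearch Java client.
public class BulkCallbackTracker {
    private int pending = 0;
    private Throwable firstFailure = null;

    /** Call from the listener's beforeBulk. */
    public synchronized void onBeforeBulk() {
        pending++;
    }

    /** Call at the very end of afterBulk; pass null when the request succeeded. */
    public synchronized void onAfterBulk(Throwable failureOrNull) {
        if (failureOrNull != null && firstFailure == null) {
            firstFailure = failureOrNull;
        }
        pending--;
        notifyAll();
    }

    /** Blocks until every started request has reported back, then rethrows the first failure. */
    public synchronized void awaitAll(long timeout, TimeUnit unit)
            throws InterruptedException, TimeoutException {
        long deadline = System.nanoTime() + unit.toNanos(timeout);
        while (pending > 0) {
            long remaining = deadline - System.nanoTime();
            if (remaining <= 0) {
                throw new TimeoutException("Timed out waiting for " + pending + " bulk callback(s)");
            }
            TimeUnit.NANOSECONDS.timedWait(this, remaining);
        }
        if (firstFailure != null) {
            Throwable failure = firstFailure;
            firstFailure = null; // reset so the tracker can be reused
            throw new IllegalStateException("Bulk request failed", failure);
        }
    }

    public static void main(String[] args) throws Exception {
        BulkCallbackTracker tracker = new BulkCallbackTracker();
        tracker.onBeforeBulk();
        // Simulates the client invoking afterBulk later on another thread.
        Thread callback = new Thread(() -> tracker.onAfterBulk(null));
        callback.start();
        tracker.awaitAll(5, TimeUnit.SECONDS);
        callback.join();
        System.out.println("all callbacks finished");
    }
}
```

What counts as a failure is up to the caller: for a response that arrives but contains item errors, the listener would have to build its own exception and pass it to onAfterBulk.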
This might be fixed with #867
Java API client version
7.17.9 and 7.17.10-SNAPSHOT
Java version
java 19
Elasticsearch Version
7.17.9
Problem description
When using the BulkIngester with a BulkListener, calling close() on the BulkIngester returns before all afterBulk() BulkListener callbacks have finished. Below is a snippet of code that uses the BulkIngester. If you simply add a few logging statements after the close() call and in afterBulk(), you will see that the afterBulk() callbacks are still running after close() returns. That should not be the case: when close() returns, the bulk should have been handled completely.
Thanks for looking into the problem.
``