failsafe-lib / failsafe

Fault tolerance and resilience patterns for the JVM
https://failsafe.dev
Apache License 2.0

RetryPolicy total time is 30s when use getAsync? #294

Closed Succy closed 2 years ago

Succy commented 3 years ago

Hi @jhalterman, I found 2 problems when trying to use Failsafe version 2.4.3. The demo code is as follows:

public static void main(String[] args) {
    final RetryPolicy<Object> retryPolicy = new RetryPolicy<>()
            .handle(Exception.class)
            .handleResultIf(Boolean.FALSE::equals)
            .withBackoff(1, 60, ChronoUnit.SECONDS)
            .withMaxRetries(10);
    ThreadUtil.execute(() -> {
        log.info("---> this is normal execution");
        Failsafe.with(retryPolicy).getAsync((CheckedSupplier<Object>) () -> {
            // Generate a random result
            boolean result = RandomUtil.randomBoolean();
            log.info("this is retry...");
            // Force the result to false so that a retry is always triggered
            result = false;
            return result;
        });
    });
}

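The 2 problems are as follows:

1. When I use the getAsync method with max retries set, execution stops before reaching the max number of retries, and the total retry time does not exceed 30 seconds.
2. The last retry always executes on a different ForkJoinPool thread than the others, e.g. rkJoinPool.commonPool-worker-2 in the console output below. Why is that?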

2021-09-09 19:58:21 INFO  [pool-1-thread-1] c.d.e.t.RetryTest.lambda$main$1(37) - ---> this is normal execution
2021-09-09 19:58:22 INFO  [rkJoinPool.commonPool-worker-1] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...
2021-09-09 19:58:23 INFO  [rkJoinPool.commonPool-worker-1] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...
2021-09-09 19:58:25 INFO  [rkJoinPool.commonPool-worker-1] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...
2021-09-09 19:58:29 INFO  [rkJoinPool.commonPool-worker-1] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...
2021-09-09 19:58:37 INFO  [rkJoinPool.commonPool-worker-1] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...
2021-09-09 19:58:53 INFO  [rkJoinPool.commonPool-worker-2] c.d.e.t.RetryTest.lambda$null$0(41) - this is retry...

jhalterman commented 3 years ago

"When I use the getAsync method with max retries set, execution stops before reaching the max number of retries, and the total retry time does not exceed 30 seconds."

I wonder if there's something else going on with the available threads on your system that is causing execution to stop after a few attempts? The RetryPolicy you described should run with 10 retries. Here's an example from my machine run against 2.4.3:

RetryPolicy<Object> retryPolicy = new RetryPolicy<>()
  .handleResult(false)
  .withBackoff(1, 60, ChronoUnit.SECONDS)
  .withMaxRetries(10);
Failsafe.with(retryPolicy).getAsync(ctx -> {
  if (!ctx.isFirstAttempt())
    System.out.printf("[%s] retrying after %s seconds%n", Thread.currentThread().getName(),
      ctx.getElapsedTime().getSeconds());
  return false;
}).get();

Output:

[ForkJoinPool.commonPool-worker-19] retrying after 1 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 3 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 7 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 15 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 31 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 63 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 123 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 183 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 243 seconds
[ForkJoinPool.commonPool-worker-19] retrying after 303 seconds
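
The elapsed times above are consistent with a delay that starts at 1 second, doubles after each attempt, and is capped at 60 seconds. Here is a minimal sketch of that arithmetic in plain Java. The doubling factor is an assumption inferred from the output, and `BackoffSchedule` is a hypothetical helper for illustration, not Failsafe's actual implementation:

```java
import java.util.ArrayList;
import java.util.List;

class BackoffSchedule {
    // Delay before each retry: starts at delaySeconds, doubles each time
    // (assumed factor of 2), and never exceeds maxDelaySeconds.
    static List<Long> delays(long delaySeconds, long maxDelaySeconds, int maxRetries) {
        List<Long> delays = new ArrayList<>();
        long delay = delaySeconds;
        for (int i = 0; i < maxRetries; i++) {
            delays.add(delay);
            delay = Math.min(delay * 2, maxDelaySeconds);
        }
        return delays;
    }

    // Running total of the delays, i.e. the time waited before each retry starts.
    static List<Long> elapsed(List<Long> delays) {
        List<Long> totals = new ArrayList<>();
        long total = 0;
        for (long d : delays) {
            total += d;
            totals.add(total);
        }
        return totals;
    }

    public static void main(String[] args) {
        List<Long> delays = delays(1, 60, 10);
        System.out.println("delays:  " + delays);
        System.out.println("elapsed: " + elapsed(delays));
    }
}
```

With the values from the policy (1, 60, 10), the running totals come out as 1, 3, 7, 15, 31, 63, 123, 183, 243, 303, matching the output above. Under the same assumption, the retry following the 19:58:53 entry in the original log would not be due until 32 seconds later.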

"The last retry always executes on a different ForkJoinPool thread than the others, e.g. rkJoinPool.commonPool-worker-2 in the console output. Why is that?"

Which thread gets used is pretty random, based on whatever is available in the common pool. Perhaps worker-1 was busy doing something else towards the end of the test run.

Succy commented 3 years ago

@jhalterman thanks for your reply! I tried the example code you gave, and it ran with 10 retries. Perhaps something is wrong in my code. I want to simulate a multi-threaded environment for retries, so I created a new thread to wrap the retry policy code and did not call get() after getAsync. Is there a problem with this approach? Is there an example for reference, such as an HTTP request? :) I ask because I want to write a message-forwarding middleware. Anyway, thank you very much.

jhalterman commented 3 years ago

getAsync returns a CompletableFuture; you can then use a method like whenComplete or get to obtain the result of the execution once it's complete. For tests like my example above I just call get to make sure the test blocks and waits for the execution to complete; otherwise the test might return instantly and you would see no output.
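
As a rough illustration of those two options, here is a minimal sketch using only the JDK. `startAsyncWork` is a hypothetical stand-in for `Failsafe.with(retryPolicy).getAsync(...)`; the only thing assumed is that both hand back a CompletableFuture that completes later on another thread:

```java
import java.util.concurrent.CompletableFuture;

class AwaitAsyncResult {
    // Hypothetical stand-in for an async Failsafe execution: completes with
    // false after a short pause on a background thread.
    static CompletableFuture<Boolean> startAsyncWork() {
        return CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(200); // simulate attempts that take a while
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return false;
        });
    }

    public static void main(String[] args) throws Exception {
        CompletableFuture<Boolean> future = startAsyncWork();

        // Option 1: register a non-blocking callback that runs on completion.
        CompletableFuture<Boolean> observed = future.whenComplete((result, error) ->
                System.out.println("callback: result=" + result + ", error=" + error));

        // Option 2: block the calling thread until the work and the callback finish.
        Boolean result = observed.get();
        System.out.println("get: result=" + result);
    }
}
```

If the blocking call is left out and nothing else keeps the JVM alive, main may return before the callback fires, which could look like the retries stopping early.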

I don't have any HTTP request examples handy since it might depend on what library you're using, but it should be pretty easy to wrap any request/connection code with Failsafe if needed.

jhalterman commented 2 years ago

Closing for now. Feel free to reopen if there is still a question/issue.