failsafe-lib / failsafe

Fault tolerance and resilience patterns for the JVM
https://failsafe.dev
Apache License 2.0

Difference between "Timeout of Failsafe" and "Timeout of CompletableFuture over JDK9" #258

Closed woolamkang closed 3 years ago

woolamkang commented 4 years ago

Hello. I want a method to stop when its execution time reaches 3 seconds. With JDK 9 or later this can be done with the timeout of CompletableFuture, as in the following code.

        log.info("START");

        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
            for (int i = 1; i <= 1E10; i++) {
                ;
            }
            return "OK";
        }).orTimeout(3, TimeUnit.SECONDS).whenComplete((result, error) -> {
            if (error == null)
                log.info("success");
            else
                log.info("failure");
        });

        String content;
        try {
            content = future.get();
            log.info("Result >> " + content);
        } catch (InterruptedException | ExecutionException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        log.info("END");

It fails after exactly 3 seconds.

But because I use JDK 8, I tried to do the same thing with Failsafe. However, when I run the following code (with Failsafe), it doesn't fail and takes a very long time.

        log.info("START");

        Timeout<Object> timeout = Timeout.of(Duration.ofSeconds(3)).withCancel(true).onSuccess(e -> log.info("success"))
                .onFailure(e -> log.info("failure"));

        CompletableFuture<String> future = Failsafe.with(timeout).getAsync(() -> {
            for (int i = 1; i <= 1E10; i++) {
                ;
            }
            return "OK";
        });

        String content;
        try {
            content = future.get();
            log.info("Result >> " + content);
        } catch (InterruptedException | ExecutionException e) {
            // TODO Auto-generated catch block
            e.printStackTrace();
        }

        log.info("END");

Why does the behavior of the two code samples differ? How can I make them work the same?

Tembrel commented 4 years ago

If you just want the orTimeout behavior in Java 8, you can create static utility methods based on the Java 9 code. Use this gist as a starting point. (I make no representation about its correctness, but it's adapted very closely from the Java 9 source and I've used it in production code.)
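To make the idea of such a utility concrete, here is a minimal sketch that runs on Java 8. It is not the code from the gist and not the JDK's implementation; the class name `Java8Timeouts` and the single daemon timer thread are assumptions made for illustration. Like `orTimeout`, it only fails the future and does nothing to stop the underlying task.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical helper (not the gist's code): a Java 8 stand-in for CompletableFuture.orTimeout.
final class Java8Timeouts {
    // One daemon thread that only fires timeouts; it never runs user work.
    private static final ScheduledExecutorService TIMER =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "timeout-timer");
            t.setDaemon(true);
            return t;
        });

    private Java8Timeouts() {}

    /** Fails the future with a TimeoutException if it has not completed within the given time. */
    static <T> CompletableFuture<T> orTimeout(CompletableFuture<T> future, long timeout, TimeUnit unit) {
        ScheduledFuture<?> timer = TIMER.schedule(() -> {
            // No effect if the future has already completed.
            future.completeExceptionally(new TimeoutException());
        }, timeout, unit);
        // If the future completes first, the pending timer is no longer needed.
        future.whenComplete((result, error) -> timer.cancel(false));
        return future;
    }
}
```

A call such as `Java8Timeouts.orTimeout(CompletableFuture.supplyAsync(task), 3, TimeUnit.SECONDS)` would then take the place of the chained `.orTimeout(3, TimeUnit.SECONDS)`.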

Your for (int i = 1; i <= 1E10; i++) loop is effectively an infinite loop, since there is no int value greater than 1E10. If you use long instead of int, the loop terminates quickly, because the compiler easily recognizes it as a no-op. Infinite spin loops are not a good way to simulate actual work. Real work that takes on the order of seconds typically involves I/O. A simple way to simulate work is to sleep, as if the thread were blocked while performing I/O.
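The claim about the loop bound can be checked directly. The small class below is only an illustration (the name `LoopBoundDemo` is made up): it shows that the largest int is still below 1E10, and that incrementing it wraps around instead of growing.

```java
final class LoopBoundDemo {
    /** Can any int value make the loop condition i <= 1E10 false? */
    static boolean intCanExceedBound() {
        // The int is widened to double for the comparison; the largest int is about 2.1E9.
        return Integer.MAX_VALUE > 1E10;
    }

    /** Incrementing the largest int wraps to the smallest, so the counter never escapes. */
    static int incrementPastMax() {
        int i = Integer.MAX_VALUE;
        i++;
        return i;
    }

    public static void main(String[] args) {
        System.out.println(intCanExceedBound());                      // prints "false"
        System.out.println(incrementPastMax() == Integer.MIN_VALUE);  // prints "true"
    }
}
```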

I rewrote your examples as a JUnit test class, using sleep instead of a spin loop. Here's a typical output:

13:24:29,447  INFO Issue258:22 - CF START
13:24:32,460  INFO Issue258:31 - CF failure
13:24:32,460  INFO Issue258:40 - CF END
13:24:32,464  INFO Issue258:57 - FSNI START
13:24:39,455  INFO Issue258:81 - CF finished sleeping
13:24:42,482  INFO Issue258:81 - FSNI finished sleeping
13:24:42,483  INFO Issue258:62 - FSNI failure
13:24:42,483  INFO Issue258:73 - FSNI END
13:24:42,484  INFO Issue258:57 - FSI START
13:24:45,486  INFO Issue258:84 - FSI interrupted
13:24:45,486  INFO Issue258:62 - FSI failure
13:24:45,487  INFO Issue258:73 - FSI END

Note that the timeout of the CompletableFuture does not interrupt the work, which finishes 10 seconds after it starts, long after the 3 second timeout fires.
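The following sketch reproduces that observation on a smaller time scale. It is not the test class from this comment; the class name `OrTimeoutDemo` and the millisecond durations are assumptions chosen to keep the run short, and it needs JDK 9 or later for `orTimeout`.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class OrTimeoutDemo {
    /** Returns {timedOut, workFinishedAnyway} for a task that sleeps longer than the timeout. */
    static boolean[] run(long timeoutMillis, long workMillis) {
        CountDownLatch workFinished = new CountDownLatch(1);

        CompletableFuture<String> future = CompletableFuture.supplyAsync(() -> {
            try {
                Thread.sleep(workMillis);     // simulated blocking work
                workFinished.countDown();     // reached only if the sleep was not interrupted
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
            return "OK";
        }).orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);

        boolean timedOut = false;
        try {
            future.join();
        } catch (CompletionException e) {
            timedOut = e.getCause() instanceof TimeoutException;
        }

        // The future has already failed here, but the worker thread may still be sleeping.
        boolean finishedAnyway = false;
        try {
            finishedAnyway = workFinished.await(workMillis * 4, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return new boolean[] {timedOut, finishedAnyway};
    }

    public static void main(String[] args) {
        boolean[] r = run(100, 400);
        System.out.println("timed out: " + r[0] + ", work finished anyway: " + r[1]);
    }
}
```

If both flags come back true, the future failed at the timeout while the sleeping task still ran to completion on its pool thread.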

The FSNI version (Failsafe Timeout withCancel(false), meaning no interruption) behaves similarly, except that it waits the full 10 seconds before throwing its TimeoutExceededException.

The FSI version (Failsafe Timeout withCancel(true), meaning the work is interrupted when cancelled) stops the work after 3 seconds.
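The interruption that makes the FSI run stop early can be illustrated with plain JDK classes. This sketch does not use Failsafe; it relies on `Future.cancel(true)`, which interrupts the running thread, as a stand-in for a cancellation that interrupts. The class name `InterruptDemo` is an assumption.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class InterruptDemo {
    /** Returns true if cancelling with interruption cut the sleeping task short. */
    static boolean cancelInterruptsSleep(long cancelAfterMillis, long workMillis) {
        ExecutorService executor = Executors.newSingleThreadExecutor();
        AtomicBoolean interrupted = new AtomicBoolean(false);
        CountDownLatch started = new CountDownLatch(1);
        CountDownLatch exited = new CountDownLatch(1);
        try {
            Future<?> task = executor.submit(() -> {
                started.countDown();
                try {
                    Thread.sleep(workMillis);   // simulated blocking work
                } catch (InterruptedException e) {
                    interrupted.set(true);      // sleep is an interruption point
                } finally {
                    exited.countDown();
                }
            });
            started.await();                    // make sure the task is actually running
            Thread.sleep(cancelAfterMillis);
            task.cancel(true);                  // true = interrupt the worker thread
            exited.await(workMillis, TimeUnit.MILLISECONDS);
            return interrupted.get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        } finally {
            executor.shutdownNow();
        }
    }
}
```

Because `Thread.sleep` responds to interruption, the task exits shortly after the cancel instead of sleeping for the full duration, which matches the "FSI interrupted" line in the log.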

Failsafe doesn't do what CompletableFuture.orTimeout does, which is to return quickly after the timeout while leaving a thread occupied, potentially indefinitely. In general we assume that Failsafe is used in an environment where multiple retries are the norm, and there tying up an unbounded number of threads for an indefinite time is not acceptable.

The burden is on the user of Failsafe Timeout to write tasks that are responsive to cancellation: use a ContextualSupplier and check the cancellation status of the context at key points, and/or use withCancel(true) and do sensible things at interruption points.
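For CPU-bound work like the spin loop in the question, being responsive means polling for cancellation inside the loop. The sketch below shows the pattern with the thread's interrupt flag and plain JDK classes. It does not use Failsafe, and the names `ResponsiveSpin`, `spin` and `stopsWhenInterrupted` are hypothetical. In a Failsafe task the same check would be made against the execution context, or against the interrupt flag when withCancel(true) is used.

```java
final class ResponsiveSpin {
    /** Spins until the iteration count is reached or the thread is interrupted; returns the count reached. */
    static long spin(long iterations) {
        long i = 0;
        while (i < iterations) {
            // Poll the interrupt flag roughly every million iterations to keep the check cheap.
            if ((i & 0xFFFFF) == 0 && Thread.currentThread().isInterrupted()) {
                break;
            }
            i++;
        }
        return i;
    }

    /** Starts an effectively endless spin, interrupts it after a delay, and reports whether it stopped. */
    static boolean stopsWhenInterrupted(long delayMillis) {
        Thread worker = new Thread(() -> spin(Long.MAX_VALUE), "spinner");
        worker.setDaemon(true);
        worker.start();
        try {
            Thread.sleep(delayMillis);
            worker.interrupt();      // what a cancellation with interruption would do
            worker.join(5_000);      // give the loop time to notice the flag
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return !worker.isAlive();
    }
}
```

Without the `isInterrupted()` check, the interrupt would only set a flag that nothing reads, and the loop would keep its thread busy regardless of any timeout.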

For a discussion on the use of Timeout in conjunction with RetryPolicy, see my comment on a different issue.