failsafe-lib / failsafe

Fault tolerance and resilience patterns for the JVM
https://failsafe.dev
Apache License 2.0
4.16k stars 295 forks source link

Result futures strongly retain supplying functions #361

Closed stevenschlansker closed 1 year ago

stevenschlansker commented 1 year ago

Hi, we use Failsafe to retry network operations that might fail. We had a minor production outage last night where a JVM ran out of heap. Upon inspection of the heap dump, we were surprised to realize that this pseudo-code retains much more memory than you expect:

var retryExec = Failsafe.with(RetryPolicy.builder()
                    .abortOn(InterruptedException.class)
                    .onFailedAttempt(e -> LOG.warn("Index action failed #{}", e.getExecutionCount(), e.getLastException()))
                    .withMaxAttempts(10)
                    .build())
                .with(outerExecutor);
var largeIntermediateState = load10KBOfData();
var key = "something";
var resultFuture = retryExec.getAsync(() -> callRemoteService(key, largeIntermediateState));
static var cacheMap = new ConcurrentHashMap<Key, CompletableFuture<Result>>();
cacheMap.put(key, resultFuture);

If the cacheMap is long lived, it seems that the original supplying function is retained indefinitely even once the result is available.

FailsafeFuture -> AsyncExecutionImpl -> RetryPolicyExecutor.apply$Lambda -> failsafe.Functions$Lambda -> OurSupplier

While the Future is executing, it makes perfect sense to retain the supplying function. But, after all retries and policies are exhausted, is that still necessary?

It seems that this could be improved with a small change: in AsyncExecutionImpl.complete, set outerFn = null at the very end. In SyncExecutionImpl.executeSync, set outerFn = null right after applying it. Then, the supplying function could be reclaimed but the result or exception would still be available in the Future.

If this cannot be improved, it is probably good to make a documentation note.