failsafe-lib / failsafe

Fault tolerance and resilience patterns for the JVM
https://failsafe.dev
Apache License 2.0

Timeout with retry hangs with Failsafe 2.4.4 #348

Closed (darkjh closed this issue 1 year ago)

darkjh commented 1 year ago

I'm having a problem with Failsafe 2.4.4 when composing a retry policy and a timeout policy, as in Failsafe.with(retryPolicy, timeoutPolicy).getStageAsync(...). This snippet should reproduce the issue: https://gist.github.com/darkjh/fd84c945474dd163a342017e40a9e375

The issue is that with multiple attempts, the first attempt is timed out correctly, but subsequent retries are scheduled and never started, so no progress is made and the execution hangs.

I also tested the same snippet with Failsafe 3.2.4 and everything works: each retry attempt is timed out and the future completes. I'm aware of the past discussions around the timeout policy, but even with 2.4.4 I would expect the retries to be executed and to complete with a timeout rather than hang.
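For readers who want the expected behavior spelled out, here is a minimal plain-JDK sketch of "retry with a per-attempt timeout". It is not Failsafe's implementation and does not use the Failsafe API. The class RetryWithTimeoutSketch and the helper retryWithTimeout are hypothetical names, and the sketch assumes Java 9+ for CompletableFuture.copy() and orTimeout(). It shows only the semantics described above: every attempt is bounded by the timeout, and the returned future completes once the retries are used up.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Hypothetical illustration only; not part of Failsafe.
class RetryWithTimeoutSketch {

    // Runs the supplier, bounds each attempt by timeoutMillis, and retries
    // on any failure (including a timeout) up to maxRetries more times.
    static <T> CompletableFuture<T> retryWithTimeout(
            Supplier<CompletionStage<T>> supplier, int maxRetries, long timeoutMillis) {
        CompletableFuture<T> result = new CompletableFuture<>();
        attempt(supplier, maxRetries, timeoutMillis, result);
        return result;
    }

    private static <T> void attempt(
            Supplier<CompletionStage<T>> supplier, int retriesLeft,
            long timeoutMillis, CompletableFuture<T> result) {
        CompletableFuture<T> bounded;
        try {
            // copy() keeps the timeout from completing the caller's own future.
            bounded = supplier.get().toCompletableFuture().copy()
                    .orTimeout(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (Throwable t) {
            // A supplier that throws synchronously counts as a failed attempt.
            bounded = new CompletableFuture<>();
            bounded.completeExceptionally(t);
        }
        bounded.whenComplete((value, error) -> {
            if (error == null) {
                result.complete(value);
            } else if (retriesLeft > 0) {
                // The failed or timed-out attempt triggers the next one.
                attempt(supplier, retriesLeft - 1, timeoutMillis, result);
            } else {
                // Retries exhausted: surface the last failure.
                result.completeExceptionally(error);
            }
        });
    }

    public static void main(String[] args) {
        AtomicInteger attempts = new AtomicInteger();
        // Every attempt returns a future that never completes on its own.
        CompletableFuture<String> future = retryWithTimeout(() -> {
            attempts.incrementAndGet();
            return new CompletableFuture<String>();
        }, 2, 100);
        try {
            future.join();
        } catch (CompletionException e) {
            System.out.println("failed with " + e.getCause() + " after "
                    + attempts.get() + " attempts");
        }
    }
}
```

With a supplier whose future never completes and maxRetries set to 2, this sketch makes three attempts and then fails the returned future with a TimeoutException. That is the outcome reported for 3.2.4, whereas on 2.4.4 the second attempt never starts.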

jhalterman commented 1 year ago

I'm not sure when I'll get to look into this, or if it will be possible to fix without a lot of the other changes that went into 3.x. For now, I would assume that there won't be another 2.x release, and that upgrading to 3.x would be recommended.