Why recommend CircuitBreaker before Retry?

failsafe-lib / failsafe

Fault tolerance and resilience patterns for the JVM

Apache License 2.0


The composition recommendations at https://failsafe.dev/policies/#composition-recommendations suggest placing the circuit breaker inside the retry policy rather than outside it. I'd love to see more elaboration on why.

I would have thought the reverse would be better; however, Failsafe is excellent, so I assume you have good reasons. My thinking:

Transient errors (short disruptions, a service mesh sending the request to a bad backend, ...) can be mitigated by retrying, and should not determine whether the request is made or a fallback is applied. The circuit breaker should trip only when retries cannot recover from the issue.

With retries outside the circuit breaker, every failed attempt accrues to the breaker's failure threshold, and retries are more likely to fail during a brief outage, so the error count ends up higher. It's easier to reason about the circuit breaker when it counts upstream (consumer-side) requests rather than individual attempts.
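To make the counting argument concrete, here is a minimal plain-Java sketch. It deliberately does not use the Failsafe API: `CountingBreaker`, `flakyBackend`, and both method names are hypothetical stand-ins invented for this illustration, and the "breaker" only counts the failures it observes (it never opens). It assumes a single consumer request against a backend that fails a fixed number of times and then recovers.

```java
import java.util.function.BooleanSupplier;

public class CompositionCount {
    /** Hypothetical stand-in for a breaker: it only counts the failures it observes. */
    static final class CountingBreaker {
        int failures = 0;

        boolean record(boolean success) {
            if (!success) failures++;
            return success;
        }
    }

    /** Hypothetical backend that fails for the first failingCalls calls, then succeeds. */
    static BooleanSupplier flakyBackend(int failingCalls) {
        int[] calls = {0};
        return () -> ++calls[0] > failingCalls;
    }

    /** Retry outside, breaker inside: the breaker observes every attempt. */
    static int retryOutsideBreaker(int maxRetries, int failingCalls) {
        CountingBreaker breaker = new CountingBreaker();
        BooleanSupplier backend = flakyBackend(failingCalls);
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            if (breaker.record(backend.getAsBoolean())) break;
        }
        return breaker.failures;
    }

    /** Breaker outside, retry inside: the breaker observes only the final outcome. */
    static int breakerOutsideRetry(int maxRetries, int failingCalls) {
        CountingBreaker breaker = new CountingBreaker();
        BooleanSupplier backend = flakyBackend(failingCalls);
        boolean success = false;
        for (int attempt = 0; attempt <= maxRetries && !success; attempt++) {
            success = backend.getAsBoolean();
        }
        breaker.record(success);
        return breaker.failures;
    }

    public static void main(String[] args) {
        // One consumer request, up to 3 retries, backend fails twice and then recovers.
        System.out.println(retryOutsideBreaker(3, 2)); // prints 2
        System.out.println(breakerOutsideRetry(3, 2)); // prints 0
    }
}
```

In this sketch the same consumer request contributes two failures to the breaker when the retry wraps it, and none when the breaker wraps the retry, which is the difference in counts described above.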

Why recommend CircuitBreaker before Retry? #377