Closed christianapel closed 9 years ago
This issue was introduced with pull request https://github.com/Comcast/jrugged/pull/37
So I fixed this by removing the or state in the check, which i think was wrong. I additionally removed a unit test that I think was actually a bunk case. I am going to push the change here in a sec if you want to take a look - I think I got the same thing with a much smaller change.
This has not fixed the actual issue. The problem just appears if the getServiceStatus() method is called after the breaker has tripped. In HALF_CLOSED state, the first line of this method calls allowRequest(), which then calls canAttempt(). This method always updates isAttemptLive to true, even if no request was performed against the wrapped service.
In this state, it's not possible to perform any further probe requests and therefore never changing back the breaker state to CLOSED.
Please copy the unit test of this pull request, it's still failing on your latest master revision.
Ahh - ok, I agree. I will merge this and get at least one additional change put up because the test needed some small timing tweaks to prevent it from failing in the travis build environment.
After the reset timeout, a getStatus() call should change the circuit breaker state to DEGRADED, but not set the isAttemptLive flag to true. This would lead to a HALF_CLOSED state which doesn't allow to execute any further attempts and therefore block all subsequent successful calls.