Closed chrisplim closed 2 years ago
Thanks for the report, I think I know what the problem may be. I'll work on getting this fixed.
@matthewshafer Were you able to find out if the scenarios that I presented are valid, that there is a problem, and what the root cause is? What would the time estimate on a fix be?
@matthewshafer Were you able to find out if the scenarios that I presented are valid, that there is a problem, and what the root cause is? What would the time estimate on a fix be?
The scenario's are valid. The issue is when running parallel requests faraday's methods return lambda's that get executed later. This means the circuit "finishes" executing before any request happens. It's unfortunate that the test suite doesn't test the parallel case where the circuit flips.
A quick workaround would be to not utilize parallel requests in faraday.
There are quite a bit of changes that are needed to support parallel requests. The middleware essentially needs a way to manually control the circuit, like #141. I've attempted to fix the issue by using the existing api but there's issues that could arise in specific situations (circuit counting the request twice).
I don't have a time estimate because there's still investigation that needs to be done with the faraday api's.
Thanks @chrisplim for reporting this issue. I'm also facing this issue; and I've been trying to find an answer. I just double checked the code... it's using in_parallel
to make parallel requests.
@matthewshafer thanks for looking into this. I'm interested in trying out the fix once you have it available. I'm currently working on a PoC with circuitbox gem in my project.
@matthewshafer You mentioned that a quick workaround would be to not utilize parallel requests in faraday. I used only typhoeus in my example below. The below implementation only breaks the circuit after a full round of parallel requests. I would guess this is expected. Is there a better way, or Is the below implementation similar to what you were thinking? https://gist.github.com/chrisplim/e11eb94c640d69cc3256816462888da6
@matthewshafer You mentioned that a quick workaround would be to not utilize parallel requests in faraday. I used only typhoeus in my example below. The below implementation only breaks the circuit after a full round of parallel requests. I would guess this is expected. Is there a better way, or Is the below implementation similar to what you were thinking?
https://gist.github.com/chrisplim/e11eb94c640d69cc3256816462888da6
What I was thinking is a little different because we need a way to not run the requests if the circuit is open.
The part that would be needed in the gist is to not queue a request in hydra if the circuit is open. While it's easy to check if a circuit is open that doesn't run any notifications/logging.
@matthewshafer Has there been any progress on this fix?
After looking to see if the circuitbox faraday middleware could support faraday's parallel requests I've found that this won't work well with the current design of the middleware. With the middleware we are able return our own response object when things fail. In the parallel mode we are unable to return our own response object when the circuit is closed but the request fails. This is because parallel mode needs to assign a response object before the requests are made.
I also don't know the extent of changes needed to circuitbox internals to support parallel requests. While I was able to get circuitbox tracking parallel requests it required some internal changes. To sum up the changes, the state of the circuit needs to be checked before the request moves on to the next middleware so we can stop it from running if the circuit is open. If the circuit is closed then we let it continue but need to send the success/failed state after the request is run. This is quite a bit more complicated compared to the non parallel flow where we can wrap the entire call in a circuitbox run. Additional work would be needed because the timing of the circuit run wouldn't reflect the actual time of the request.
I'm going to add to the readme that the faraday middleware does not work well with parallel requests and I'll clean up the tests in circuitbox that make it seem like it might work with it.
When I defined an open_circuit lambda to capture a Typheous request timeout, the open_circuit code block is evaluated as I would expect, but a RequestFailed exception is raised. I would have expected the exception to be caught and that failed request to count towards the error rate threshold for an open circuit.
I added two tests on my fork to demonstrate the scenarios that I am trying to get working or get some clarity on. https://github.com/chrisplim/circuitbox/blob/f1c6e754ca8678471d4d60040aaa29553be13d26/test/integration/faraday_middleware_test.rb#L70-L107