Closed ponyisi closed 1 month ago
If fallback/retry works as expected, it retries 5 times and should wait for ~2 min in total (10 + 2*10 + 30 + 30 + 30 s). We also check the transform status every 5 s.
BTW, I haven't seen this error before. Could you let me know which version of the aiohttp package you are using?
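For reference, the waits quoted above (10 + 2*10 + 30 + 30 + 30) are consistent with a doubling backoff capped at 30 s. The sketch below only reproduces that arithmetic. The function name and parameters are hypothetical, and the capped-doubling rule is an assumption inferred from the numbers, not taken from the client's code.

```python
from typing import List


def backoff_delays(retries: int = 5, base: float = 10.0, cap: float = 30.0) -> List[float]:
    """Return the wait before each retry: base doubles every attempt, capped at `cap`.

    Assumption: this mirrors the 10, 20, 30, 30, 30 schedule quoted in the thread.
    """
    return [min(base * 2 ** attempt, cap) for attempt in range(retries)]


delays = backoff_delays()
total_wait = sum(delays)  # total seconds spent waiting across all retries
print(delays, total_wait)
```

With the defaults, the total wait comes to 120 s, which matches the "~2 min" estimate.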
Hi @kyungeonchoi the version is 3.10.5.
I suspect the issue here is that the connection succeeds, but the payload that is returned is bad (due to some weird proxy issue or something), and so the retry logic doesn't kick in (it is not a timeout or a 5xx error).
I had to add retries like this to get things working for the 200 Gbps.
I think everything but the transform submission needs retries. :-) The reason not to retry submission is that when I did, and the server didn't reply, the transform was still being submitted, so I ended up making things worse by just re-submitting. :-)
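A minimal sketch of the idea discussed above: retry read-only calls (such as status checks) when the connection succeeds but the payload is unparseable, and leave submission unwrapped so a lost reply never triggers a duplicate transform. It uses only the standard library. `with_retries` and `make_flaky_status` are hypothetical names for illustration, not part of the client or of aiohttp, and the set of retryable exceptions is an assumption.

```python
import asyncio
import json
from typing import Awaitable, Callable, Dict, Tuple, TypeVar

T = TypeVar("T")


async def with_retries(
    op: Callable[[], Awaitable[T]],
    attempts: int = 3,
    delay: float = 1.0,
    retry_on: Tuple[type, ...] = (json.JSONDecodeError, ConnectionError, asyncio.TimeoutError),
) -> T:
    """Run an idempotent coroutine, retrying on transient failures.

    A bad payload (JSONDecodeError) is treated as retryable, not only timeouts.
    Do NOT wrap non-idempotent calls such as transform submission.
    """
    if attempts < 1:
        raise ValueError("attempts must be >= 1")
    for attempt in range(1, attempts + 1):
        try:
            return await op()
        except retry_on:
            if attempt == attempts:
                raise  # out of retries: surface the last error to the caller
            await asyncio.sleep(delay)
    raise AssertionError("unreachable")


def make_flaky_status(bad_responses: int):
    """Hypothetical stand-in for a status query whose first replies are garbage."""
    calls: Dict[str, int] = {"n": 0}

    async def get_status() -> dict:
        calls["n"] += 1
        # Simulate a proxy returning HTML instead of the expected JSON body.
        if calls["n"] <= bad_responses:
            body = "<html>proxy error</html>"
        else:
            body = '{"status": "Running"}'
        return json.loads(body)

    return get_status, calls


get_status, calls = make_flaky_status(bad_responses=2)
status = asyncio.run(with_retries(get_status, attempts=5, delay=0))
print(status, calls["n"])
```

Here the first two replies fail to parse and the third succeeds, so the caller sees a normal result. A submission call would be made directly, without this wrapper.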
@ponyisi - I've created a PR #469 to fix this issue. Please let me know if it looks good to you. I will add tests then.
Retries added
When testing against the AGC I see frequent errors of the form
The client needs protection against these responses (which I believe are transient?). A request should not die on the client side simply because one response failed. (That said, I think there are some real problems with the HTTP proxies for connections to the SSL k8s ...)
Marking as a 3.0 thing because this really shows up on almost all my attempted submissions.