Closed dannyfreeman closed 2 years ago
Thanks for the input. This is complicated by the fact that the interpretation of SSLHandshakeException as a fault happens outside aws-api, in the cognitect/http-client, and we're planning to support "bring your own http client" at some point in the future, which takes control over how exceptions are interpreted as anomalies further out of our hands.
You are using the right escape hatch in the way it is intended. At the very least, we should update the README to explain how this works and suggest using a custom retriable? function when you run into scenarios like this.
Thanks for the response! The retriable? workaround we have right now is a fine solution for us. I'm sure other people would appreciate having something about it in the README. The docstrings in the library were very helpful and pointed us in the right direction in that regard.
If you think it's worthwhile for me to keep chasing this down, is there a way I could raise this issue with the cognitect/http-client repository? I have no idea where it is hosted.
Thanks for offering to help, but the cognitect/http-client is not hosted in a public repo. It's open source in that you can look at the source, but it is not open for contribution.
Hey @dannyfreeman, I added a "retriable errors" section to the README. I'm going to close this issue, but feel free to add comments here if you have any. We can always reopen it if it's useful.
@dchelimsky that extra info in the README looks great, thanks for updating it.
As an update, we started seeing this error when calling other AWS APIs. Some specific ones are CloudWatch :PutMetricData and SNS :PublishBatch.
I think we may be seeing something related to this ticket here: https://github.com/cognitect-labs/aws-api/issues/127
Once we saw that it wasn't isolated to SQS and the DeleteMessage endpoint, we've overridden the default retriable? for all of our clients to check for this specific exception.
(defn- aws-ssl-ca-error?
  [{:cognitect.anomalies/keys [category message] :as resp}]
  (and (= category :cognitect.anomalies/fault)
       (= message "Abruptly closed by peer")
       (instance? javax.net.ssl.SSLHandshakeException
                  (:cognitect.http-client/throwable resp))))

(defn default-retriable?
  [response]
  (or (cognitect.aws.retry/default-retriable? response)
      (aws-ssl-ca-error? response)))
Then we use our version of default-retriable? when creating clients, and use it instead of this library's default-retriable? when we want to override it for a specific operation. Hopefully this helps out anyone else that runs into the problem.
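For anyone wiring this up, here is a minimal sketch of the approach described above, not the exact code we run. It assumes the :retriable? option that aws-api documents for both aws/client and the op map passed to aws/invoke. The names ssl-handshake-fault? and or-retriable, and the queue URL and receipt handle placeholders, are hypothetical. The library calls sit inside a comment form so the predicates can be loaded and tried at a REPL without the aws-api dependency or AWS credentials.

```clojure
(ns example.retriable
  (:import (javax.net.ssl SSLHandshakeException)))

(defn ssl-handshake-fault?
  "Hypothetical helper: true when resp is a fault anomaly carrying an
  SSLHandshakeException with the message seen in this thread."
  [{:cognitect.anomalies/keys [category message] :as resp}]
  (and (= category :cognitect.anomalies/fault)
       (= message "Abruptly closed by peer")
       (instance? SSLHandshakeException
                  (:cognitect.http-client/throwable resp))))

(defn or-retriable
  "Hypothetical combinator: returns a predicate that is true when any of
  preds returns a truthy value for the response."
  [& preds]
  (fn [resp]
    (boolean (some #(% resp) preds))))

(comment
  ;; Requires the aws-api dependency (plus the :sqs service artifact) on the classpath.
  (require '[cognitect.aws.client.api :as aws]
           '[cognitect.aws.retry :as retry])

  ;; Keep the library's default behaviour and add the SSL handshake case.
  (def retriable?
    (or-retriable retry/default-retriable? ssl-handshake-fault?))

  ;; Client-wide: every op invoked on this client uses the predicate.
  (def sqs (aws/client {:api :sqs :retriable? retriable?}))

  ;; Per operation: overrides the client's predicate for this call only.
  (aws/invoke sqs {:op         :DeleteMessage
                   :request    {:QueueUrl      "<queue-url>"
                                :ReceiptHandle "<receipt-handle>"}
                   :retriable? retriable?}))
```

Composing predicates this way keeps the library's own busy/unavailable checks intact, so the override only widens what counts as retriable.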
If I can find the time I will try to dive into the issue more and see if I can reliably reproduce it, but that has proven difficult so far.
@dchelimsky we are also running into this error when publishing to eventbridge
Dependencies
Description with failing test case
When calling the DeleteMessage API in one of our deployed services, we occasionally get a :cognitect.anomalies/fault error. It does not happen every time, maybe once out of every couple hundred requests.
This is the code, there is not much to it
After posting this issue in the #aws channel on the Clojurians Slack, and some advice from Ghadi, we started calling invoke with this :retriable? argument
This seems to have solved our issue, but we don't really know what the root cause is. We've never seen it happen with other AWS endpoints, just :DeleteMessage. If it's a common issue, it would be nice if the cognitect aws api could somehow categorize this type of exception as retriable.
Stack traces