DougMidgley / SFMC-SDK

Other
16 stars 5 forks source link

Retry on http status 429, 500, 503? #132

Open dobesv opened 2 years ago

dobesv commented 2 years ago

It might be nice if the library would retry on http status 429, 500, 503 e.g. using axios-retry. If the Retry-After header is provided, it should honor that.

e.g. see https://developer.salesforce.com/docs/marketing/marketing-cloud/guide/rate-limiting-best-practices.html

Retry with exponential backoff

Because you can encounter transient errors or timeouts, your API client probably contains logic to handle failures. If that logic is to continuously retry on failure, however, you can be rate-limited. Instead of continuous retry on failures, consider using exponential backoff for retries with web services or any cloud-based services.

Exponential backoff exponentially increases the wait time between subsequent retries. For example, the first retry happens after 2 seconds, the second retry happens 4 seconds after the first retry, the third retry happens 8 seconds after second retry, and so on, perhaps until a maximum retry length is reached. The exact logic depends on the language and pattern that you use.

Honor the HTTP 429 error code

To find out if you are rate-limited, look for the HTTP 429 Too Many Requests response status code. This response status code indicates that you sent too many requests in a given amount of time and need to make adjustments.

If you receive the 429 error code, you can use exponential backoff logic. As part of the error response, the Retry-After response HTTP header indicates how long the user agent must wait before making a follow-up request:

Retry-After: \<delay-seconds>

Delay-seconds is a non-negative decimal integer that indicates how many seconds to delay after the response is received.

JoernBerkefeld commented 1 year ago

good idea @dobesv - wanna work on that yourself maybe? :)

DougMidgley commented 1 year ago

@dobesv definitely something we can add. I would be avoiding additional dependencies, esp on axios, as I have plans to move to node native fetch and moving to modules.

JoernBerkefeld commented 1 year ago

Couldn't replicate this so far. @dobesv do you have an activity im mind that triggers this regularly for you?

dobesv commented 1 year ago

an activity im mind that triggers this regularly for you?

I don't have any great ideas how to test it. You would have to send a high volume of requests to the API until it rate limits you, to get this in a live environment, which could be a bit of a pain. Maybe there's a way to simulate a response using some kind of mock adapter or something?

DougMidgley commented 1 year ago

Hey @dobesv, Yea i tried 20k requests over 5 min and still didnt git anything using soap or rest. Given Marketing Cloud's ability to document one thing and implement another or implement differently depending on the request, unless we can get a real response, I wouldn't be a fan of adding this in. We do have mocks in the tests we run, so once we have a concrete example, we could then add that in, but that example is currently missing from the wild it seems :( have you ever seen it?

dobesv commented 1 year ago

Sure, if it's a pain to add feel free to punt on it. I don't recall actually getting throttled I just thought it would be nice to put measures in place "just in case". But I wouldn't spend a ton of time writing your own custom retry system, I thought maybe there would be a library you could use that would handle this easily.

dobesv commented 1 year ago

If this starts causing us some pain I can try to open a PR later, or maybe someone else will step in and fix it.

For anyone who does try and take it on, if you use fetch there's a library here that could help make it a few lines of code: https://www.npmjs.com/package/fetch-retry

dobesv commented 1 year ago

Another way to handle this that could have some beneficial side-effects is to allow users of the library to provide their own axios instance. In addition to allowing us to add our own retry logic, it would also allow us to provide a mock axios instance for our own tests, add proxy options, stuff like that.

DougMidgley commented 1 year ago

@dobesv indeed, without being certain of people getting it I would leave it for now. As I mentioned, the docs often don't align 1:1 with reality so wouldn't want to overdesign it and then have it not work. The plan is to move to native fetch in v2 since node now supports it natively, so we could look at implementing a retry lib in that case :)