Open yoshuawuyts opened 2 years ago
This should be pretty straight forward to fix or look at. I think the main question is how it should interact with the other settings of the policy. i.e. what if it is longer than max_delay
or max_elapsed
? I would think you would just take the min?
It might also need to be added to the trait, because the function with access to the headers isn't what picks the duration between retries.
Do we know if Azure is actually including this header? I have never seen it, but we also usually get ServerBusy | Service Unavailable (503)
. I also don't see 429 in the listed common or blob store error codes. Maybe they are used by some of the other services?
You can see which services are using 429 by looking at the specs: https://cs.github.com/?scopeName=All+repos&scope=&q=%22429%22%3A+repo%3AAzure%2Fazure-rest-api-specs++language%3AJSON
And since @cataggar pointed out this is documented in code, the header is indeed actually used sometimes: https://cs.github.com/Azure/azure-rest-api-specs?q=%2F%22429%22%3A.*%5Cn.*headers.*%5Cn.*Retry%2F+repo%3AAzure%2Fazure-rest-api-specs++language%3AJSON
So back to the question of what do you do when it conflicts with the policy.
It seems like the specific delay between retries should be the max of the Retry-After
time and the policies specified max_delay
. Essentially max_delay
is a best attempt unless the service asks for more delay. And once the max_elapsed
is reached even if the delay ended up being longer than max_delay
we should no longer retry.
As per the MDN page on HTTP Status Code 429, when a
429
status is sent it may include aRetry-After
HTTP header which specifies how long to wait before trying again. Currently ourRetryPolicy
does not check for this header, and we should make it so it does.