Azure / azure-openai-benchmark

Azure OpenAI benchmarking tool
MIT License
130 stars 54 forks source link

RETRY_AFTER_MS_HEADER not getting properly "honored" #33

Open sushaanttb opened 9 months ago

sushaanttb commented 9 months ago

Considering following code,

image

In the context of this code, if the RETRY_AFTER_MS_HEADER is found in the response headers, the function will sleep for that much time but then the function will resume from the point where asyncio.sleep was called, not from the start of the function or the start of the while loop.

Thus, in cases when the RETRY_AFTER_MS_HEADER duration would be greater than MAX_RETRY_SECONDS, which is very much possible, considering how much less the value of MAX_RETRY_SECONDS (5s) currently is, the condition of the while loop will be false after waking from the thread sleep as the while loop is considering the stale time value capture from the 1st incident of throttling, hence the while loop will not continue, effectively meaning the request will not be retried.

Suggestions:

  1. The time check logic in while loop should not be based on the stale value and should update considering the time difference of thread sleep from RETRY_AFTER_MS_HEADER sleep condition.
  2. Value for MAX_RETRY_SECONDS should not be hardcoded and be made configurable and decided by user. See #36
  3. Documenting about MAX_RETRY_SECONDS and letting the users know how it works. See #37
michaeltremeer commented 9 months ago

Hey @sushaanttb,

Have a look at this PR and the discussion within. You will see some more info there about what is going on with different retry settings, and how MAX_RETRY_SECONDS effects the outputs. That PR has not been merged and it doesn't look like it will, so if you want the behaviour you suggest, I would suggest you use my fork which fixes this issue and uses a more suitable value of 60 seconds for MAX_RETRY_SECONDS.