Open yan-hic opened 2 years ago
Hi @yiga2,
RESOURCE_EXHAUSTED
is not an error code the client library will retry automatically. The rationale is that immediately retrying a call when quota is exhausted isn't guaranteed to succeed and retrying might still cause billing charges. https://google.aip.dev/194 describes automatic retry configurations for our APIs.
The only error considered safe to retry across the board is UNAVAILABLE
indicating a transient connection error. Document AI also includes DEADLINE_EXCEEDED
in its retryable error codes for a few methods (see configuration).
Is it possible the call is retried too quickly and fails as the quota hasn't yet been reset? You could try tweaking the initial
and multiplier
arguments on the Retry
object to ensure enough time has passed for the requests per minute quota to reset. The default retry object has an initial delay of 1 second and a multiplier of 2.
I understand the error is not retried automatically but the code snippet above makes it retry, and I checked and DocAI does not bill for incomplete calls.
@busunkim96 You raised a good point on the quota reset as I misread the limit - the quota is per min, not per sec (which is more common). As the calls originate from cloud functions, I don't control the rate but can definitely tweak the settings.
Looking at the initial call in the screenshot, the deadline must be culprit as +3 min passed - I will extend but it would be good if the retry library could log when deadline is reached (and only that event, not every retry).
We recently added retry support for RESOURCE_EXHAUSTED
in google-cloud-bigquery-storage
. https://github.com/googleapis/python-bigquery-storage/pull/366
It's more subtle than other retries we've done so far, as the response message can contain a suggested sleep time until the next request (at least in the case of BigQuery, not sure if that's a general Google API thing)
We are using a cloud function that, among others, calls
DocumentAI
. Assuming the latter has implemented the Retry dependency correctly, we have the following code that should retry when hitting a (retriable) quota limit:However, the call is not retried as cloud logging shows:![image](https://user-images.githubusercontent.com/25224265/151846319-6c231c41-af2f-4ba6-8b8c-5608db5ce8f9.png)
Any clue ?