This avoids the following error, which long-context requests to Gemini models can trigger when quota is insufficient:
Error: 429 Resource exhausted. Please try again later. Please refer to https://cloud.google.com/vertex-ai/generative-ai/docs/quotas#error-code-429 for more details.
This approach retries the request with exponential backoff whenever a ResourceExhausted error is raised.
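The retry behavior described above can be sketched roughly as follows. This is a minimal illustration, not the exact implementation used here: the helper name `call_with_backoff` and its parameters are hypothetical, and the `ResourceExhausted` class is a local stand-in so the snippet runs without extra dependencies. In real use you would likely catch `google.api_core.exceptions.ResourceExhausted` instead.

```python
import random
import time


class ResourceExhausted(Exception):
    """Local stand-in for google.api_core.exceptions.ResourceExhausted (HTTP 429)."""


def call_with_backoff(fn, max_attempts=5, initial_delay=1.0, multiplier=2.0,
                      max_delay=60.0, sleep=time.sleep):
    """Call fn(), retrying with exponential backoff on ResourceExhausted.

    All names and defaults here are illustrative assumptions.
    """
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return fn()
        except ResourceExhausted:
            if attempt == max_attempts:
                raise  # out of attempts: surface the quota error to the caller
            # Wait for the current delay (capped) plus up to 10% random jitter,
            # then grow the delay for the next attempt.
            sleep(min(delay, max_delay) + random.uniform(0, 0.1 * delay))
            delay *= multiplier
```

To use it, wrap the model request in a zero-argument callable and pass it as `fn`. The jitter keeps several clients from retrying in lockstep, and the `max_delay` cap bounds the wait on later attempts.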