Open lennijusten opened 3 days ago
The thing that controls whether we back off and retry API errors is this function:
    @override
    def is_rate_limit(self, ex: BaseException) -> bool:
        return isinstance(
            ex,
            TooManyRequests | InternalServerError | ServiceUnavailable | GatewayTimeout,
        )
You could play with this to see if there is another exception type that would pick up this error.
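For anyone experimenting along these lines, here is a rough, self-contained sketch of how a predicate like this can drive retry with exponential backoff. It is not Inspect's actual retry loop: the exception classes are stand-ins defined locally, and `is_retryable` and `call_with_backoff` are hypothetical names used only for illustration.

```python
import random
import time


class TooManyRequests(Exception):
    """Stand-in for an HTTP 429 error (hypothetical, for illustration only)."""


class InternalServerError(Exception):
    """Stand-in for an HTTP 500 error (hypothetical, for illustration only)."""


# Exception types that should trigger a backoff and retry.
RETRYABLE = (TooManyRequests, InternalServerError)


def is_retryable(ex: BaseException) -> bool:
    """Return True if the exception should be retried after a delay."""
    return isinstance(ex, RETRYABLE)


def call_with_backoff(fn, max_attempts=5, base_delay=0.5, sleep=time.sleep):
    """Call fn(), retrying retryable errors with exponential backoff and jitter."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as ex:
            # Give up on non-retryable errors or once attempts are exhausted.
            if not is_retryable(ex) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; jitter spreads out concurrent retries.
            delay = base_delay * (2 ** attempt) * (1 + random.random())
            sleep(delay)
```

Adding a new exception type to `RETRYABLE` is the analogue of extending the `isinstance` check above: any error that matches gets retried instead of failing the eval.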
You can also use --max-connections to throttle down the number of active connections.
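To illustrate why capping connections helps, here is a minimal sketch of concurrency limiting with `asyncio.Semaphore`. This is only an assumption about the general technique, not a description of how Inspect implements the flag; `run_limited` and `fake_request` are hypothetical names, and the sleep stands in for a real API call.

```python
import asyncio


async def run_limited(n_tasks: int, max_connections: int) -> int:
    """Run n_tasks fake requests and return the peak number in flight at once."""
    sem = asyncio.Semaphore(max_connections)
    active = 0
    peak = 0

    async def fake_request(i: int) -> int:
        nonlocal active, peak
        # At most max_connections coroutines can be inside this block.
        async with sem:
            active += 1
            peak = max(peak, active)
            await asyncio.sleep(0.001)  # stand-in for the API round trip
            active -= 1
        return i

    await asyncio.gather(*(fake_request(i) for i in range(n_tasks)))
    return peak
```

With a lower cap, fewer requests reach the API at the same moment, which should reduce the chance of tripping server-side limits at the cost of a longer total run time.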
I'm trying to run Gemini 1.5 Pro on various evals and keep encountering an Internal Server Error (500). I opened an issue about this in the google-gemini/generative-ai-python repo, and as best I can tell it seems to be caused by too many incoming API requests (full traceback and environment details are attached there). I'm unsure whether it's in the purview of Inspect to handle this, but I wanted to flag the issue.