Open rafal-dudek opened 1 month ago
Hi @rafal-dudek just to clarify, are you looking for a retry solution with support for:
@ddobrin
We would just like to have a feature similar to what other models in Spring AI offer, e.g. OpenAI Chat with its retry properties: https://docs.spring.io/spring-ai/reference/api/chat/openai-chat.html#_retry_properties
Currently we have implemented retries on top of the VertexAiGeminiChatModel invocation, but it would be nice to have this implemented in the library.
For now, we do not use provisioned throughput mode, so we do not need it, but of course it is a nice feature that could be available.
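For anyone needing the same workaround, retries wrapped around the model invocation might look roughly like the sketch below. It is plain Java with no Spring dependency, and it is not the code we actually run or anything from the library: the `RetryingCall` class, the `withRetry` method and its parameters are hypothetical names chosen for illustration.

```java
import java.time.Duration;
import java.util.function.Predicate;
import java.util.function.Supplier;

/** Hypothetical helper (not part of Spring AI): retries a call with exponential backoff. */
public final class RetryingCall {

    private RetryingCall() {
    }

    /**
     * Runs the call, retrying when it throws an exception accepted by the predicate.
     *
     * @param call         the operation to run, e.g. a chat model invocation
     * @param retryable    decides whether a given failure is worth retrying
     * @param maxAttempts  total number of attempts, including the first one
     * @param initialDelay wait time before the second attempt
     * @param multiplier   factor applied to the wait time after each failed attempt
     */
    public static <T> T withRetry(Supplier<T> call,
                                  Predicate<RuntimeException> retryable,
                                  int maxAttempts,
                                  Duration initialDelay,
                                  double multiplier) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delayMillis = initialDelay.toMillis();
        for (int attempt = 1; ; attempt++) {
            try {
                return call.get();
            } catch (RuntimeException e) {
                // Give up when attempts are exhausted or the failure is not retryable.
                if (attempt >= maxAttempts || !retryable.test(e)) {
                    throw e;
                }
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException interrupted) {
                    // Restore the interrupt flag and surface the original failure.
                    Thread.currentThread().interrupt();
                    throw e;
                }
                delayMillis = (long) (delayMillis * multiplier);
            }
        }
    }
}
```

A call site could then be something like `RetryingCall.withRetry(() -> chatModel.call(prompt), e -> isQuotaError(e), 5, Duration.ofSeconds(2), 2.0)`, where `isQuotaError` is a placeholder for whatever check matches the resource exhaustion failure in your setup, and the attempt count and delays are example values only.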
I see a merged PR: https://github.com/spring-projects/spring-ai/pull/1437. So it looks like this should be working now with v1.0.0-M3, but it is not described in the Gemini documentation: https://docs.spring.io/spring-ai/reference/api/chat/vertexai-gemini-chat.html
Expected Behavior
VertexAiGeminiChatModel should support retry options similar to those of e.g. OpenAiChatModel.
Current Behavior
VertexAiGeminiChatModel does not use retry.
Context
The Gemini 1.5 Pro model sometimes returns an error:
Retrying in such cases is crucial for stable application operation. More information on resource exhaustion: https://cloud.google.com/vertex-ai/generative-ai/docs/quotas#troubleshoot-dynamic-shared-quota
There is already an issue to add the spring-ai-retry dependency, https://github.com/spring-projects/spring-ai/issues/832, but just adding the dependency does not solve the problem of VertexAiGeminiChatModel not using retries.