FirebaseExtended / codelab-ai-genkit-rag

Apache License 2.0

switch model to 1.0-pro #15

Open adonisote opened 4 months ago

adonisote commented 4 months ago

Fixes #

Step 8. https://firebase.google.com/codelabs/ai-genkit-rag#7

It fails with the following error: ⨯ Error: Vertex response generation failed: ClientError: [VertexAI.ClientError]: got status: 429 Too Many Requests. {"error":{"code":429,"message":"Quota exceeded for aiplatform.googleapis.com/generate_content_requests_per_minute_per_project_per_base_model with base model: gemini-1.5-flash. Please submit a quota increase request. https://cloud.google.com/vertex-ai/docs/generative-ai/quotas-genai.","status":"RESOURCE_EXHAUSTED"}} at Generator.throw () digest: "764727645" POST /gemini 500 in 2669ms

When I switch to gemini-1.0-pro-preview, it passes. Flash produces the same error on my side. I guess this is because of the quotas described at: https://cloud.google.com/vertex-ai/generative-ai/docs/quotas
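
The error message in this report names both the HTTP status and the base model that ran out of quota, which helps confirm which model a request was actually sent to. A minimal sketch of extracting those two fields follows. It is not part of the codelab or this pull request: the function name `parseQuotaError`, the `QuotaInfo` shape, and the regular expressions are assumptions based only on the message text quoted above.

```typescript
// Hypothetical helper: pull the status code and base model out of a
// Vertex AI quota error message shaped like the one quoted in this thread.
interface QuotaInfo {
  status: number;
  baseModel: string | null;
}

function parseQuotaError(message: string): QuotaInfo | null {
  // Only treat the message as a quota error if it reports HTTP 429.
  const statusMatch = message.match(/got status: (\d{3})/);
  if (statusMatch === null || statusMatch[1] !== "429") {
    return null;
  }
  // The model name is followed by a sentence-ending period, so strip it.
  const modelMatch = message.match(/base model: ([A-Za-z0-9.\-]+)/);
  const baseModel = modelMatch?.[1]?.replace(/\.$/, "") ?? null;
  return { status: 429, baseModel };
}

// Sample text copied from the error reported above.
const sampleQuotaMessage =
  "[VertexAI.ClientError]: got status: 429 Too Many Requests. " +
  "Quota exceeded for aiplatform.googleapis.com/" +
  "generate_content_requests_per_minute_per_project_per_base_model " +
  "with base model: gemini-1.5-flash. Please submit a quota increase request.";

console.log(parseQuotaError(sampleQuotaMessage));
```

For the sample message, the helper reports the base model as gemini-1.5-flash, the same model named in the quoted error.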

alexastrum commented 3 months ago

Hi @adonisote, given that you're hitting the quota limit for 1.5 Pro, I'd suggest trying Gemini 1.5 Flash.

1.0 Pro could also be a good choice, but given that it is an older model, I would consider the 1.5 family first.

@nohe427, I wonder if it's worthwhile to look into switching to the 1.5 Flash model in this codelab.

nohe427 commented 3 months ago

@alexastrum & @adonisote: I would prefer the 1.5 Flash model instead. Could you please replace the model line with vertexai/gemini-1.5-flash-preview, since this model has a higher quota and is cheaper to run?
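
As a rough illustration of the requested change, the sketch below keeps the model ID in a single constant so that switching models is a one-line edit. Only the string `vertexai/gemini-1.5-flash-preview` comes from this thread. The `GenerateRequest` shape, `MODEL_ID`, and `buildRequest` are hypothetical stand-ins and not the codelab's actual code or the Genkit API.

```typescript
// Hypothetical request shape; the real codelab hands the model to Genkit.
interface GenerateRequest {
  model: string;
  prompt: string;
}

// Model ID requested in this thread. Changing models means editing this line.
const MODEL_ID = "vertexai/gemini-1.5-flash-preview";

// Builds a request that always uses the configured model.
function buildRequest(prompt: string): GenerateRequest {
  return { model: MODEL_ID, prompt };
}

console.log(buildRequest("Plan a weekend trip").model);
```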