sidagarwal04 opened this issue 1 month ago (status: Open)
Hello, not sure if this will help you, but I think it is the cause of my 429 errors: is it the 32,000 tokens-per-minute limit that is breaking your request? I'm getting 429 errors on every prompt over 32,000 tokens sent via the API for Gemini 1.5 Pro. I think it must be an error... otherwise the 1,000,000-token context window becomes a bit irrelevant!...
I don't think I am sending that many tokens per minute but let me check again if that's the case.
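For anyone trying to check the same thing: a client-side sliding-window budget makes it easy to see whether you are actually brushing up against a 32,000 tokens-per-minute quota before the API returns a 429. A minimal sketch, pure Python; the class name and the rough 4-characters-per-token estimate are my own assumptions, not part of any Google SDK (the real count comes from the API's token-counting endpoint):

```python
import time
from collections import deque

class TokenBudget:
    """Client-side sliding-window budget for a tokens-per-minute quota."""

    def __init__(self, tokens_per_minute=32_000):
        self.limit = tokens_per_minute
        self.window = deque()  # (timestamp, tokens) pairs from the last 60 s

    def _used(self, now):
        # Drop entries older than 60 seconds, then sum what remains.
        while self.window and now - self.window[0][0] >= 60:
            self.window.popleft()
        return sum(tokens for _, tokens in self.window)

    def wait_time(self, tokens, now=None):
        """Seconds to wait before `tokens` more can be sent within the quota."""
        now = time.monotonic() if now is None else now
        if self._used(now) + tokens <= self.limit:
            return 0.0
        oldest_ts, _ = self.window[0]
        return 60 - (now - oldest_ts)

    def record(self, tokens, now=None):
        """Log a request's token usage after it is sent."""
        now = time.monotonic() if now is None else now
        self.window.append((now, tokens))

def estimate_tokens(text):
    # Very rough heuristic: ~4 characters per token for English text.
    return max(1, len(text) // 4)
```

Calling `wait_time(estimate_tokens(prompt))` before each request, and sleeping for the returned number of seconds, should make it obvious whether the per-minute quota (rather than the context window) is what is being exceeded.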
Has the problem been solved? I also had the same problem.
@paopao0226 Not yet. Still struggling with this. :(
I also have the same issue on gemini-1.0-pro
Same problem bruh. I want to give the AI a very long system prompt as context, and I was wondering how a 1M-token limit could already exhaust resources when what I give it is barely 40-50K tokens at most. This 32K-tokens-per-minute limit is stupid.
Perhaps the 1 million context only works with documents that are uploaded?
Which model are you using @ReEnMikki and @Joseph-Cardwell
I was using gemini-1.5-pro
I think @Benniepie 's image is correct; when I used gemini-1.5-flash I could upload tons of text in one go with no problem. The only problem is that this model is more stupid than gemini-1.5-pro, which is already more stupid than GPT-4o.
😂 Sadly, OpenAI is clearly ahead in performance. But I think the reason for your error is that you interacted with the model past the number of times you are allowed to in a minute. For gemini-1.5-pro I think it's 2x per minute.
32K tokens per minute defeats the whole purpose of a 1M-token context window. Now if I want to give it a long set of instructions as a system prompt, to serve as base context for it to read from when generating outputs, I have to call the API dozens of times, which means waiting dozens of minutes. Bastard Google put this unnecessary limit in place to subtly force us to pay, suffocating the viability of the free tier.
What are you building, agents?
Checked other resources
Example Code
Error Message and Stack Trace (if applicable)
===== Application Startup at 2024-05-28 17:43:47 =====
Caching examples at: '/home/user/app/gradio_cached_examples/14' Caching example 1/4
MATCH (m:Movie) WHERE m.released > 2000 RETURN m LIMIT 5
Full Context: [{'m': {'tagline': 'Free your mind', 'title': 'The Matrix Reloaded', 'released': 2003}}, {'m': {'tagline': 'Everything that has a beginning has an end', 'title': 'The Matrix Revolutions', 'released': 2003}}, {'m': {'title': "Something's Gotta Give", 'released': 2003}}, {'m': {'tagline': 'This Holiday Season… Believe', 'title': 'The Polar Express', 'released': 2004}}, {'m': {'tagline': "Based on the extraordinary true story of one man's fight for freedom", 'title': 'RescueDawn', 'released': 2006}}]
Full Context: Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
[{'p': {'born': 1960, 'name': 'Hugo Weaving'}, 'r': ({'born': 1960, 'name': 'Hugo Weaving'}, 'ACTED_IN', {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}), 'm': {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}}, {'p': {'born': 1956, 'name': 'Tom Hanks'}, 'r': ({'born': 1956, 'name': 'Tom Hanks'}, 'ACTED_IN', {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}), 'm': {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}}, {'p': {'born': 1966, 'name': 'Halle Berry'}, 'r': ({'born': 1966, 'name': 'Halle Berry'}, 'ACTED_IN', {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}), 'm': {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}}, {'p': {'born': 1949, 'name': 'Jim Broadbent'}, 'r': ({'born': 1949, 'name': 'Jim Broadbent'}, 'ACTED_IN', {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}), 'm': {'tagline': 'Everything is connected', 'title': 'Cloud Atlas', 'released': 2012}}]
Full Context: [{'p.name': 'Ron Howard'}]
Full Context: [{'p': {'born': 1960, 'name': 'Hugo Weaving'}}, {'p': {'born': 1981, 'name': 'Natalie Portman'}}, {'p': {'born': 1946, 'name': 'Stephen Rea'}}, {'p': {'born': 1940, 'name': 'John Hurt'}}, {'p': {'born': 1967, 'name': 'Ben Miles'}}] Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 4.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 8.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 16.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
Retrying langchain_google_genai.chat_models._chat_with_retry.._chat_with_retry in 32.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
To create a public link, set share=True in launch().

Description
I am trying to use Google Gemini 1.5 Pro API key from Google AI Studio in the above code and getting the error:
Retrying langchain_google_genai.chat_models._chat_with_retry.<locals>._chat_with_retry in 2.0 seconds as it raised ResourceExhausted: 429 Resource has been exhausted (e.g. check quota)..
This doesn't seem right, as the API call was made only twice. I tried switching to gemini-1.5-flash and it seems to work fine. I am assuming this has something to do with gemini-1.5-pro's integration with LangChain. Quoting one of the replies in a "somewhat" similar issue:

System Info