okada1220 opened this issue 2 months ago
I'm experiencing the same issue when using `models/gemini-1.5-pro-001` and trying to cache roughly 300k tokens, even though that model has an input token limit of 2,097,152.
@okada1220,
Thank you for reporting this issue. This looks like an intermittent error and should work now. Automatic retry logic has been added to the SDK to avoid these errors, and you can follow FR #502 for examples of retry logic. Thanks.
@singhniraj08 Thank you for your response.
I checked again, and it seems that the same error is still occurring...
I looked at the retry logic example in #502, which seems to apply when using `request_options` with `generate_content`. But since I'm using `genai.caching.CachedContent.create`, which doesn't have `request_options`, I'm wondering if this retry logic is still applicable here. Do you think this approach will work in my case?
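For reference, here is the kind of wrapper I have in mind: a minimal sketch assuming `google-api-core` is installed (the function name `create_cache_with_retry` and the backoff numbers are just placeholders):

```python
import google.generativeai as genai
from google.api_core import exceptions, retry

# CachedContent.create doesn't accept request_options, so instead of the
# request_options approach from #502, wrap the call itself with
# google.api_core's generic retry decorator.
@retry.Retry(
    predicate=retry.if_exception_type(exceptions.ServiceUnavailable),
    initial=1.0,    # first backoff delay, in seconds
    maximum=30.0,   # cap on the backoff delay
    timeout=300.0,  # give up after 5 minutes overall
)
def create_cache_with_retry(model, contents):
    return genai.caching.CachedContent.create(model=model, contents=contents)
```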
I'm receiving this error too.
Description of the bug:
I'm trying to create a cache by reading the contents of multiple PDF files, but when the total number of tokens in the files exceeds approximately 500,000, I receive a 503 error (Service Unavailable) from Google API Core.
The error isn't returned immediately, but only after about 40 to 50 seconds, which might indicate that a timeout is occurring in Google API Core.
Code
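A minimal sketch of the failing pattern (the file names and API key are placeholders; the PDFs are uploaded through the File API before the cache is created):

```python
import google.generativeai as genai

genai.configure(api_key="YOUR_API_KEY")  # placeholder

# Upload each PDF through the File API so it can be referenced by the cache.
pdf_paths = ["doc1.pdf", "doc2.pdf", "doc3.pdf"]  # placeholders
pdf_files = [genai.upload_file(path=p) for p in pdf_paths]

# This call fails with a 503 after about 40-50 seconds once the combined
# contents exceed roughly 500,000 tokens.
cache = genai.caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    contents=pdf_files,
)
```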
Version
Actual vs expected behavior:
Actual behavior
Expected behavior
Any other information you'd like to share?
Upon reviewing the Gemini API documentation, I noticed an interesting mismatch regarding token limits. The maximum token count is described as depending on the specific model in use; in my case, I'm using the `models/gemini-1.5-flash-001` model, which has a maximum input token limit of 1,048,576. Based on this, I initially assumed that processing around 500,000 tokens should work without any issues.

Moreover, I was able to successfully create the cache even with token counts exceeding 800,000 when creating it from a string. This leads me to suspect that there might be a bug specifically related to creating caches from files with high token counts, as opposed to string-based caching.
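For comparison, the string-based variant that succeeds looks roughly like this (a sketch; `extracted_text.txt` is a placeholder standing in for the pre-extracted PDF text):

```python
import google.generativeai as genai

# String-based caching succeeded even past 800,000 tokens.
with open("extracted_text.txt", encoding="utf-8") as f:  # placeholder file
    big_text = f.read()

cache = genai.caching.CachedContent.create(
    model="models/gemini-1.5-flash-001",
    contents=[big_text],
)
```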