Open arthurbrenno opened 6 days ago
Hi @arthurbrenno and thanks for the feature request!
How do you envision the usage of the cache in LlamaIndex? I'm not sure what the UX should look like here, for a couple of reasons:
If you have any suggestion about how we could integrate the feature, even in pseudo-code, that would help me a lot!
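Not a proposal for the actual implementation, but one possible UX sketch: a (hypothetical) `cached_content` parameter on the LlamaIndex `Gemini` LLM class, forwarded into every request body. None of the names below exist in LlamaIndex today; they are illustrative only:

```python
# Rough sketch only: `cached_content` is a hypothetical parameter, not an
# existing LlamaIndex API. The idea: the user creates a cache out-of-band
# (via the Gemini caching API) and hands its resource name to the LLM
# wrapper, which attaches it to every generateContent request.
from typing import Any, Dict, Optional


def build_request(prompt: str, cached_content: Optional[str] = None) -> Dict[str, Any]:
    """Build the body a Gemini LLM wrapper could send to generateContent.

    `cached_content` would be the resource name returned by the caching
    API (e.g. "cachedContents/abc123"); when present, the cached prefix
    (large system prompt, shared documents, ...) is billed at the reduced
    cached-token rate instead of being resent in full on each call.
    """
    body: Dict[str, Any] = {
        "contents": [{"role": "user", "parts": [{"text": prompt}]}],
    }
    if cached_content is not None:
        body["cachedContent"] = cached_content
    return body


# Hypothetical end-user UX:
#   llm = Gemini(model="models/gemini-1.5-flash-001",
#                cached_content="cachedContents/abc123")
#   llm.complete("Summarize the cached corpus.")
```

Pushing the cache reference down to the request-body level would keep the rest of the LlamaIndex call path unchanged, which is why a single constructor kwarg seems like the lightest-touch option.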
Feature Description
Google now provides a way to reduce costs by caching input tokens so they can be referenced in subsequent requests. It would be nice to have this implemented.
Docs: https://ai.google.dev/gemini-api/docs/caching?utm_source=gais&utm_medium=email&utm_campaign=june&lang=python
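For context, the flow from those docs involves two request bodies, roughly as below. This sketch only constructs the JSON; the model name, TTL, and cache resource name are placeholders, so treat it as an outline of the shape rather than working client code:

```python
# Sketch of the two requests involved in Gemini context caching, per the
# docs linked above. All values (model, TTL, text, cache name) are
# placeholders; this builds JSON only and does not hit the network.
import json

# 1) Create a cache: POST /v1beta/cachedContents
create_cache_body = {
    "model": "models/gemini-1.5-flash-001",
    "systemInstruction": {"parts": [{"text": "You are an expert on this corpus."}]},
    "contents": [{"role": "user", "parts": [{"text": "<large shared document here>"}]}],
    "ttl": "300s",  # cache lifetime; cached tokens are billed at a reduced rate
}

# 2) Use the cache: POST /v1beta/models/{model}:generateContent, referencing
#    the resource name returned by step 1 (e.g. "cachedContents/abc123").
generate_body = {
    "cachedContent": "cachedContents/abc123",  # placeholder resource name
    "contents": [{"role": "user", "parts": [{"text": "What does the document say?"}]}],
}

print(json.dumps(generate_body, indent=2))
```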
Reason
This can help Gemini users reduce costs when large system prompts are used, or in token-intensive agent tasks.
Value of Feature
No response