yujonglee / eval

Evaluate your LLM apps, RAG pipeline, any generated text, and more!
MIT License

litellm.gpt_cache should be disabled for num>1 in run call #98

Closed. yujonglee closed this issue 1 year ago.

yujonglee commented 1 year ago

If we enable the cache, there is no point in running the same evaluation multiple times to check consistency. https://github.com/fastrepl/fastrepl/blob/2cbf1de3559a68a3f585ea64f176dfe0aa8b93c5/fastrepl/runner.py#L58

But in other situations, I want to keep the cache enabled, since it saves time and money while experimenting.

yujonglee commented 1 year ago

LiteLLM recently added completion(..., cache=False), which is exactly what I need.

The blocker is the data manager that fastrepl currently uses: https://github.com/zilliztech/GPTCache/blob/790e9f4211929a0b06b640f9c77b54e842b183af/gptcache/manager/data_manager.py#L88
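
Once that is resolved, the runner-side change would be small. A minimal sketch (the evaluate helper and model name are hypothetical; the real logic lives in fastrepl/runner.py, linked above), assuming the cache kwarg acts as a per-call switch as described:

import litellm

def evaluate(prompt: str, num: int = 1) -> list:
    # For num > 1 we want independent samples, so bypass the cache;
    # a cache hit would return the same response num times and make
    # the consistency check meaningless.
    return [
        litellm.completion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            cache=(num == 1),
        )
        for _ in range(num)
    ]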

yujonglee commented 1 year ago

If we add something similar to LiteLLM, it would look something like:

# https://docs.litellm.ai/docs/caching/#using-redis-cache-with-litellm
litellm.cache = Cache(..., type="memory" | "disk" | "redis")
yujonglee commented 1 year ago

We can give up the disk cache and use the in-memory cache.
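
Following the proposal above, that would be a one-line setup (a sketch, assuming the type="memory" option and import path from the earlier snippet actually ship):

import litellm
from litellm.caching import Cache  # assumed import path for the proposed Cache

# In-memory cache: still saves time and money within a session, and
# avoids GPTCache's disk-backed data manager entirely, at the cost of
# losing cached responses when the process exits.
litellm.cache = Cache(type="memory")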