LiteLLM recently added `completion(..., cache=False)`, which is exactly what I need.
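For reference, a minimal sketch of how that per-call toggle would be used. The kwarg name (`cache=False`) is taken from this issue; the current LiteLLM API may spell it differently, so check the docs.

```python
import litellm

# Minimal sketch of the per-call cache toggle described above.
# The kwarg name (cache=False) comes from this issue; the current
# LiteLLM API may differ.
response = litellm.completion(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Hello"}],
    cache=False,  # bypass the cache for this single call
)
```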
The blocker is that `litellm`'s cache only supports Redis (I need to run a Redis instance and provide host/port), which is inconvenient.

The current data manager that `fastrepl` is using:
https://github.com/zilliztech/GPTCache/blob/790e9f4211929a0b06b640f9c77b54e842b183af/gptcache/manager/data_manager.py#L88
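For context, the linked data manager is essentially an in-memory map that gets persisted to disk. A rough sketch of that idea (class and method names are illustrative, not GPTCache's actual API):

```python
import os
import pickle

# Rough sketch of a map-style data manager like the one linked above:
# an in-memory dict that can be flushed to disk. Names are
# illustrative, not GPTCache's actual API.
class MapDataManager:
    def __init__(self, data_path: str):
        self.data_path = data_path
        self.data = {}
        if os.path.exists(data_path):
            with open(data_path, "rb") as f:
                self.data = pickle.load(f)

    def save(self, key, value):
        self.data[key] = value

    def get(self, key):
        return self.data.get(key)

    def flush(self):
        with open(self.data_path, "wb") as f:
            pickle.dump(self.data, f)
```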
If we add something similar to LiteLLM, it would look something like this:
```python
# https://docs.litellm.ai/docs/caching/#using-redis-cache-with-litellm
litellm.cache = Cache(..., type="memory" | "disk" | "redis")
```
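As a rough illustration, the "memory" backend could be as simple as a dict keyed by a hash of the request. Everything below is hypothetical, not LiteLLM's actual classes:

```python
import hashlib
import json

# Hypothetical sketch of the proposed "memory" backend; class and
# method names are illustrative only.
class InMemoryCache:
    def __init__(self):
        self._store = {}

    def _key(self, model: str, messages: list) -> str:
        # Identical requests hash to the same cache key.
        raw = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(raw.encode()).hexdigest()

    def get(self, model, messages):
        return self._store.get(self._key(model, messages))

    def set(self, model, messages, response):
        self._store[self._key(model, messages)] = response
```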
We can give up the disk cache and use an in-memory cache.
If we enable the cache, there is no point in running the same evaluation multiple times to check consistency:
https://github.com/fastrepl/fastrepl/blob/2cbf1de3559a68a3f585ea64f176dfe0aa8b93c5/fastrepl/runner.py#L58
But in other situations, I want to have the cache enabled, since it saves time and money while experimenting.
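To make the two use cases concrete, here is a sketch assuming the proposed API lands. `Cache(type="memory")` is the proposal above (not the current API), and the import path and `cache=` kwarg are assumptions:

```python
import litellm
from litellm.caching import Cache  # import path assumed from the docs linked above

# Cache on for day-to-day experiments; cache bypassed when repeating
# an evaluation to check consistency. Cache(type="memory") is the
# *proposed* API, not the current one.
litellm.cache = Cache(type="memory")

def run_eval(prompt: str, check_consistency: bool = False):
    return litellm.completion(
        model="gpt-3.5-turbo",
        messages=[{"role": "user", "content": prompt}],
        # Consistency runs must hit the model every time; normal
        # experiment runs can be served from the cache.
        cache=not check_consistency,
    )
```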