yujonglee / eval

Evaluate your LLM apps, RAG pipeline, any generated text, and more!
MIT License

litellm.gpt_cache should be disabled for num>1 in run call #98

Closed. yujonglee closed this issue 1 year ago.

yujonglee commented 1 year ago

If we enable the cache, there is no point in running the same evaluation multiple times to check consistency. https://github.com/fastrepl/fastrepl/blob/2cbf1de3559a68a3f585ea64f176dfe0aa8b93c5/fastrepl/runner.py#L58

But in other situations, I want to keep the cache enabled, since it saves time and money while experimenting.

yujonglee commented 1 year ago

LiteLLM recently added completion(..., cache=False), which is exactly what I need.

The blocker is the data manager that fastrepl currently uses: https://github.com/zilliztech/GPTCache/blob/790e9f4211929a0b06b640f9c77b54e842b183af/gptcache/manager/data_manager.py#L88
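
Once that is resolved, the runner-side change would be small. A minimal sketch (the evaluate helper and model name are hypothetical; the real logic lives in fastrepl/runner.py, linked above), assuming the cache kwarg acts as a per-call switch as described:

import litellm

def evaluate(prompt: str, num: int = 1) -> list:
    # For num > 1 we want independent samples, so bypass the cache;
    # a cache hit would return the same response num times and make
    # the consistency check meaningless.
    return [
        litellm.completion(
            model="gpt-3.5-turbo",
            messages=[{"role": "user", "content": prompt}],
            cache=(num == 1),
        )
        for _ in range(num)
    ]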

yujonglee commented 1 year ago

If we add something similar to LiteLLM, it would look something like:

# https://docs.litellm.ai/docs/caching/#using-redis-cache-with-litellm
litellm.cache = Cache(..., type="memory" | "disk" | "redis")
yujonglee commented 1 year ago

We can give up the disk cache and use the in-memory cache.
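
Following the proposal above, that would be a one-line setup (a sketch, assuming the type="memory" option and import path from the earlier snippet actually ship):

import litellm
from litellm.caching import Cache  # assumed import path for the proposed Cache

# In-memory cache: still saves time and money within a session, and
# avoids GPTCache's disk-backed data manager entirely, at the cost of
# losing cached responses when the process exits.
litellm.cache = Cache(type="memory")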