zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
https://gptcache.readthedocs.io
MIT License
6.96k stars, 490 forks

Distributed cache feature using Redis #518

Closed a9raag closed 11 months ago

a9raag commented 11 months ago

Distributed Cache using Redis #496

This PR adds a distributed cache feature, which allows us to store keys in a Redis key-value store. Sorry it took a while; I spent a lot of time thinking about the design and how it would integrate into GPTCache's existing code base.

I have also attached a high-level design of how Distributed Cache works.

Notes

A couple of notes about how the eviction will work based on different scalar storage configurations:

Scenario 1: Scalar storage is MongoDB and the eviction base is Redis. The keys are then stored in Redis, and the cache callback updates their TTL.
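
To make Scenario 1 concrete, here is a minimal sketch of a key tracker whose TTL is refreshed by a cache callback. It is not the code from this PR: `TTLKeyTracker`, `on_cache_hit`, and `live_keys` are hypothetical names, and a plain dict with an injectable clock stands in for Redis so the example runs without a server.

```python
import time
from typing import Callable, Dict, List


class TTLKeyTracker:
    """Hypothetical stand-in for the Redis-side key store in Scenario 1.

    A dict plays the role of Redis: each key maps to its expiry timestamp.
    """

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._expires_at: Dict[str, float] = {}

    def put(self, keys: List[str]) -> None:
        """Register newly cached keys, each with a fresh TTL."""
        now = self._clock()
        for key in keys:
            self._expires_at[key] = now + self._ttl

    def on_cache_hit(self, key: str) -> bool:
        """Cache callback: refresh the TTL if the key is still alive."""
        now = self._clock()
        expiry = self._expires_at.get(key)
        if expiry is None or expiry <= now:
            # Unknown or already expired: drop it and report a miss.
            self._expires_at.pop(key, None)
            return False
        self._expires_at[key] = now + self._ttl
        return True

    def live_keys(self) -> List[str]:
        """Return the keys that have not expired yet, sorted."""
        now = self._clock()
        return sorted(k for k, exp in self._expires_at.items() if exp > now)
```

With a 10-second TTL, a key that is hit at t=8 survives past t=10, while a key that was never hit again expires. The scalar data itself (MongoDB in this scenario) is not modelled here.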

Scenario 2: Scalar storage is Redis and the eviction base is also Redis. Storing keys separately then becomes redundant, since Redis evicts based on memory usage rather than on the number of keys. We would also fetch values multiple times, once when getting the scalar data and again when the cache callback fires, which adds network overhead.

To tackle this, I created a NoOpEviction implementation (which skips cache operations), since:

  1. Redis internally manages eviction for all keys, based on memory size and the configured eviction policy.
  2. We can configure RedisCacheStorage to update the TTL on creation and on access. This way, values are fetched from Redis only once.
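
The two points above can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `NoOpEvictionSketch` and `TTLRefreshingStore` are hypothetical classes, their method names are chosen for the example, and a dict replaces Redis. The point it shows is that the eviction object does nothing, while the storage layer refreshes the TTL during the single read it already performs.

```python
import time
from typing import Callable, Dict, List, Optional, Tuple


class NoOpEvictionSketch:
    """Hypothetical eviction object whose operations deliberately do nothing."""

    def put(self, keys: List[str]) -> None:
        pass  # nothing to track: the backing store evicts on its own

    def get(self, key: str) -> None:
        pass  # no callback work, so no extra round trip

    @property
    def policy(self) -> str:
        return "noop"


class TTLRefreshingStore:
    """Hypothetical storage that sets a TTL on write and renews it on read."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._data: Dict[str, Tuple[str, float]] = {}  # key -> (value, expiry)
        self.reads = 0  # counts simulated round trips to the store

    def set(self, key: str, value: str) -> None:
        self._data[key] = (value, self._clock() + self._ttl)

    def get(self, key: str) -> Optional[str]:
        """Fetch the value and renew its TTL in the same access."""
        self.reads += 1
        now = self._clock()
        entry = self._data.get(key)
        if entry is None or entry[1] <= now:
            self._data.pop(key, None)
            return None
        value, _ = entry
        self._data[key] = (value, now + self._ttl)
        return value
```

In a real deployment the single-access read might map to something like Redis's `GETEX` command (available from Redis 6.2), which returns a value and sets its expiry in one call; whether this PR uses it is not stated in this thread.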

[Image: GPT Distributed Cache]

a9raag commented 11 months ago

@SimFG Can you help? The test cases are failing because of:

'(ReadTimeoutError("HTTPSConnectionPool(host='huggingface.co', port=443): Read timed out. (read timeout=10.0)"), '(Request ID: 3afc7015-e7b6-4b6a-abd6-acd698b9469f)')' thrown while requesting GET https://huggingface.co/GPTCache/paraphrase-albert-small-v2/resolve/main/tokenizer.json

SimFG commented 11 months ago

@a9raag ok, I have rerun it

a9raag commented 11 months ago

@SimFG it failed again

codecov[bot] commented 11 months ago

Codecov Report

Merging #518 (ce8db36) into dev (5aa7e1e) will decrease coverage by 0.16%. Report is 1 commits behind head on dev. The diff coverage is 93.33%.

Additional details and impacted files

```diff
@@            Coverage Diff             @@
##              dev     #518      +/-   ##
==========================================
- Coverage   94.05%   93.89%   -0.16%     
==========================================
  Files          94       95       +1     
  Lines        3920     4014      +94     
==========================================
+ Hits         3687     3769      +82     
- Misses        233      245      +12     
```

| [Files Changed](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree) | Coverage Δ | |
|---|---|---|
| [gptcache/manager/eviction/memory\_cache.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9ldmljdGlvbi9tZW1vcnlfY2FjaGUucHk=) | `90.90% <ø> (ø)` | |
| [gptcache/manager/eviction/distributed\_cache.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9ldmljdGlvbi9kaXN0cmlidXRlZF9jYWNoZS5weQ==) | `89.65% <89.65%> (ø)` | |
| [gptcache/manager/scalar\_data/redis\_storage.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9zY2FsYXJfZGF0YS9yZWRpc19zdG9yYWdlLnB5) | `96.13% <93.18%> (-1.51%)` | :arrow_down: |
| [gptcache/\_\_init\_\_.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvX19pbml0X18ucHk=) | `100.00% <100.00%> (ø)` | |
| [gptcache/manager/data\_manager.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9kYXRhX21hbmFnZXIucHk=) | `89.60% <100.00%> (+0.10%)` | :arrow_up: |
| [gptcache/manager/eviction/manager.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9ldmljdGlvbi9tYW5hZ2VyLnB5) | `100.00% <100.00%> (ø)` | |
| [gptcache/manager/factory.py](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518?src=pr&el=tree#diff-Z3B0Y2FjaGUvbWFuYWdlci9mYWN0b3J5LnB5) | `96.55% <100.00%> (+0.89%)` | :arrow_up: |

... and [1 file with indirect coverage changes](https://app.codecov.io/gh/zilliztech/GPTCache/pull/518/indirect-changes?src=pr&el=tree-more)

a9raag commented 11 months ago

@SimFG apologies for bugging you again, but the Unit Test workflow failed again 😭

Here's the error log:

/home/runner/work/_temp/c7cf298b-8f2b-475f-995a-f3f5562139f1.sh: line 3:  3496 Illegal instruction     (core dumped) python3 -m pytest -k "not embedding and not processor" --cov=gptcache --cov-report xml:coverage.xml --cov-append ./tests/
Error: Process completed with exit code 132.

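
For reading the log above: shells conventionally report a process killed by a signal as exit code 128 plus the signal number, so 132 corresponds to signal 4, which is SIGILL on Linux and matches the "Illegal instruction" message. A small sketch of that decoding, where `describe_exit_code` is a hypothetical helper and not part of the CI tooling:

```python
import signal


def describe_exit_code(code: int) -> str:
    """Decode a shell exit code using the 128 + signal-number convention."""
    if code > 128:
        # Codes above 128 conventionally mean "terminated by a signal".
        return f"killed by {signal.Signals(code - 128).name}"
    return f"exited with status {code}"


print(describe_exit_code(132))
```

This only identifies the signal; it does not say which native library executed the unsupported instruction.
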
a9raag commented 11 months ago

Hi @SimFG and @xiaofan-luan, I've updated the PR. The Unit Tests and Codecov checks have now passed. Could you please review it when you get a chance? Thanks : )

SimFG commented 11 months ago

@a9raag great!!! I will check it today

SimFG commented 11 months ago

/lgtm
/approve

sre-ci-robot commented 11 months ago

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: a9raag, SimFG

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

- ~~[OWNERS](https://github.com/zilliztech/GPTCache/blob/dev/OWNERS)~~ [SimFG]

Approvers can indicate their approval by writing `/approve` in a comment.
Approvers can cancel approval by writing `/approve cancel` in a comment.