zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
https://gptcache.readthedocs.io
MIT License

[Feature]: Find a more suitable similarity evaluation model #549

Open SimFG opened 12 months ago

SimFG commented 12 months ago

Is your feature request related to a problem? Please describe.

Currently, GPTCache achieves an accuracy of approximately 80% under optimal conditions. However, in day-to-day use it often returns unsatisfactory cache results, indicating that it does not perform well in practical production scenarios.

GPTCache uses the user's query to retrieve similar entries from the vector database and then computes the similarity between the query and the questions stored in those entries. However, this approach has not proven effective. We also tried the cross-encoder/quora-distilroberta-base model to match the answers in the retrieved entries against the user's query, but the results were still unsatisfactory.
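
For reference, here is a minimal sketch of the cross-encoder scoring step described above, using the sentence-transformers library directly rather than GPTCache's own `similarity_evaluation` module (the actual integration may differ; the query and candidate strings below are made up for illustration, and the 0.7 threshold is an assumed value that would need tuning):

```python
# Illustrative sketch of cross-encoder scoring for cache candidates.
# Assumes the sentence-transformers package is installed; example strings
# and the threshold are hypothetical.
from sentence_transformers import CrossEncoder

# The model mentioned in this issue, trained on Quora duplicate-question pairs.
model = CrossEncoder("cross-encoder/quora-distilroberta-base")

user_query = "How do I clear the GPTCache store?"
cached_questions = [
    "How can I delete all entries from the cache?",
    "How do I install GPTCache?",
]

# Score each (user query, cached question) pair; higher means more similar.
scores = model.predict([(user_query, q) for q in cached_questions])

# A simple threshold decides whether the best candidate counts as a cache hit.
best_idx = int(scores.argmax())
if scores[best_idx] > 0.7:  # assumed threshold, needs tuning per workload
    print("cache hit:", cached_questions[best_idx], scores[best_idx])
else:
    print("cache miss")
```

The core problem described in this issue is that pairwise scores like these do not separate good and bad cache candidates reliably enough for production use, regardless of where the threshold is set.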

Is there another model available that better suits this particular scenario?

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

Vasanthavel124 commented 12 months ago

Can you assign this to me?