zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
https://gptcache.readthedocs.io
MIT License

[Feature]: Find a more suitable similarity evaluation model #549

Open SimFG opened 12 months ago

SimFG commented 12 months ago

Is your feature request related to a problem? Please describe.

Currently, GPTCache achieves an accuracy of approximately 80% under optimal conditions. However, in day-to-day use it often returns unsatisfactory cache results, indicating that it does not perform well in practical production scenarios.

GPTCache uses the user's query to retrieve similar entries from the vector database and then computes the similarity between the query and the questions stored in those entries. However, this approach has not proven effective. We also tried the cross-encoder/quora-distilroberta-base model to match the answers in the retrieved entries against the user's query, but the results were still unsatisfactory.
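
For reference, here is a minimal sketch of the cross-encoder scoring step described above, using the sentence-transformers library directly rather than GPTCache's own `similarity_evaluation` module (the actual integration may differ; the query and candidate strings below are made up for illustration, and the 0.7 threshold is an assumed value that would need tuning):

```python
# Illustrative sketch of cross-encoder scoring for cache candidates.
# Assumes the sentence-transformers package is installed; example strings
# and the threshold are hypothetical.
from sentence_transformers import CrossEncoder

# The model mentioned in this issue, trained on Quora duplicate-question pairs.
model = CrossEncoder("cross-encoder/quora-distilroberta-base")

user_query = "How do I clear the GPTCache store?"
cached_questions = [
    "How can I delete all entries from the cache?",
    "How do I install GPTCache?",
]

# Score each (user query, cached question) pair; higher means more similar.
scores = model.predict([(user_query, q) for q in cached_questions])

# A simple threshold decides whether the best candidate counts as a cache hit.
best_idx = int(scores.argmax())
if scores[best_idx] > 0.7:  # assumed threshold, needs tuning per workload
    print("cache hit:", cached_questions[best_idx], scores[best_idx])
else:
    print("cache miss")
```

The core problem described in this issue is that pairwise scores like these do not separate good and bad cache candidates reliably enough for production use, regardless of where the threshold is set.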

Is there another model available that better suits this particular scenario?

Describe the solution you'd like.

No response

Describe an alternate solution.

No response

Anything else? (Additional Context)

No response

Vasanthavel124 commented 12 months ago

Can you assign this to me?