zilliztech / GPTCache

Semantic cache for LLMs. Fully integrated with LangChain and llama_index.
https://gptcache.readthedocs.io
MIT License

Request to support loading a cache of a previously saved multi-turn dialogue using OpenAI #479

Closed athenasaurav closed 1 year ago

athenasaurav commented 1 year ago

I am running the following code to talk to a bot built with prompts on OpenAI:

import time
from gptcache import cache
from gptcache.adapter import openai
from gptcache.embedding import Onnx
from gptcache.manager import CacheBase, VectorBase, get_data_manager
from gptcache.similarity_evaluation.distance import SearchDistanceEvaluation

def response_text(openai_resp):
    return openai_resp['choices'][0]['message']['content']

print("Cache loading.....")

onnx = Onnx()
# SQLite stores the scalar data (questions and answers); Faiss stores the
# question embeddings used for similarity search.
data_manager = get_data_manager(CacheBase("sqlite"), VectorBase("faiss", dimension=onnx.dimension))
cache.init(
    embedding_func=onnx.to_embeddings,
    data_manager=data_manager,
    similarity_evaluation=SearchDistanceEvaluation(),
    )
cache.set_openai_key()

conversation = [
    {
        "role": "system",
        "content": "<Prompt for Open AI"
    }
]

while True:
    question = input("Please enter your question (or 'exit' to stop): ")

    if question.lower() == 'exit':
        break

    conversation.append({
        'role': 'user',
        'content': question
    })

    start_time = time.time()
    # The gptcache adapter looks for a semantically similar cached question
    # before falling back to the real OpenAI API.
    response = openai.ChatCompletion.create(
        model='gpt-3.5-turbo',
        messages=conversation,
    )

    answer = response_text(response)
    conversation.append({
        'role': 'assistant',
        'content': answer
    })

    # Here we are saving the conversation as it happens
    embedding_data = onnx.to_embeddings(question)
    data_manager.save(question, answer, embedding_data)

    print(f'Question: {question}')
    print("Time consuming: {:.2f}s".format(time.time() - start_time))
    print(f'Answer: {answer}\n')

Now I want to check whether a previously saved cache exists and, if it does, load it so that it is not a fresh start every time I run the script. How can I do that?

SimFG commented 1 year ago

Hi @athenasaurav. By default, the openai adapter saves the last round of the dialogue in the cache, so the data_manager.save(question, answer, embedding_data) call is not necessary.
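
In other words, once cache.init has run, the loop body only needs the adapter call; a minimal sketch reusing the names from the code above:

# The adapter both looks up semantically similar cached questions and
# stores new question/answer pairs, so no manual data_manager.save is needed.
response = openai.ChatCompletion.create(
    model='gpt-3.5-turbo',
    messages=conversation,
)
answer = response_text(response)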

Also, I don't quite understand what you mean by whether the cache is available. Do you mean that you want to periodically flush the cache to a file?

If so, SQLite is written to its file every time the code executes, but Faiss is not. You can manually call cache.flush() to save the in-memory contents to the file.
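
For example, a minimal sketch that persists the Faiss index; registering the flush with atexit is my own suggestion rather than something from the docs:

import atexit
from gptcache import cache

# Write any in-memory cache state (e.g. the Faiss index) to disk on exit,
# so the next run can reuse previously cached question/answer pairs.
atexit.register(cache.flush)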

athenasaurav commented 1 year ago

Hello @SimFG

Thank you for your prompt reply. I understand that I don't need to save the conversation using data_manager.save.

What I meant was that I want to run multiple parallel conversations, all of which save their questions and answers in the data_manager, so that when a new conversation happens, the answers from previous conversations can help answer the user.

To explain in more detail: in the first conversation a few questions are answered, and when a new session with a new user is initiated, if he asks a question similar to a previous user's question, the cache should be triggered.

Thank you for the cache.flush part.

Also, is there a way to connect to my Azure OpenAI deployment, and to rename the SQLite database and Faiss index to something other than the default names? I can't see any custom-name option that can be passed to the adapters.

SimFG commented 1 year ago

@athenasaurav

What I meant was that I want to run multiple parallel conversations, all of which save their questions and answers in the data_manager, so that when a new conversation happens, the answers from previous conversations can help answer the user.

You need to use a different storage setup, because SQLite and Faiss store their data in local files. Use another store combination; reference: https://gptcache.readthedocs.io/en/latest/configure_it.html#data-manager
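
For example, a sketch of a server-backed combination that several processes can share, assuming a reachable MySQL and Milvus deployment (the connection URL, host, and port below are placeholders):

from gptcache.manager import manager_factory

# Scalar data (questions/answers) in MySQL, vectors in Milvus, so that
# multiple parallel conversations read from and write to one shared cache.
data_manager = manager_factory(
    "mysql,milvus",
    data_dir="./gptcache_data/",
    scalar_params={"sql_url": "mysql+pymysql://user:pass@localhost:3306/gptcache"},
    vector_params={"dimension": onnx.dimension, "host": "localhost", "port": "19530"},
)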

Also, is there a way to connect to my Azure OpenAI deployment?

Before using the cache, you can set it up as below, and then you will be using Azure OpenAI. Doc: Azure OpenAI Completions [the original comment included a screenshot].
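
A minimal sketch of that configuration, assuming the pre-1.0 openai Python SDK that the code above uses (the resource name, key, API version, and deployment name are placeholders):

import openai  # the underlying SDK; the gptcache adapter delegates requests to it

# Point the client at an Azure OpenAI resource instead of api.openai.com.
openai.api_type = "azure"
openai.api_base = "https://<your-resource>.openai.azure.com/"
openai.api_version = "2023-05-15"
openai.api_key = "<your-azure-openai-key>"

With this set, pass the Azure deployment name via engine= (or deployment_id=) instead of model= in ChatCompletion.create.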

And to rename the SQLite database and Faiss index to something other than the default names?

Of course. You can use the manager_factory method and set the data_dir param, which determines where the store files live. If you want to change the store file names themselves, check each store's reference: for SQLite you can set the sql_url param; for Faiss, the index_path param.
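
A sketch of the second option, reusing the CacheBase/VectorBase setup from the question (the file names below are placeholders):

from gptcache.manager import CacheBase, VectorBase, get_data_manager

# sql_url picks the SQLite file; index_path picks the Faiss index file.
data_manager = get_data_manager(
    CacheBase("sqlite", sql_url="sqlite:///./my_bot_cache.db"),
    VectorBase("faiss", dimension=onnx.dimension, index_path="./my_bot_cache.index"),
)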

athenasaurav commented 1 year ago

Thanks @SimFG for the response. I will surely check it.

SimFG commented 1 year ago

Hi @athenasaurav, I will close this issue. If you encounter any other problems with GPTCache, please open a new issue.