Save the trained model to local disk, then reload it from the local file to avoid re-training every time.

vanna-ai / vanna

🤖 Chat with your SQL database 📊. Accurate Text-to-SQL Generation via LLMs using RAG 🔄.

MIT License

I do this with the built-in LocalContext_OpenAI, which saves and reloads the Chroma db:

import os

from vanna.local import LocalContext_OpenAI

# Directory where ChromaDB persists the training data; reused on later runs.
chroma_path = os.environ.get("VANNA_CHROMA_PATH", "./vanna-db")
vn = LocalContext_OpenAI(
    config={
        # Read the key from the environment; never paste a real key into code.
        "api_key": os.environ["OPENAI_API_KEY"],
        "model": "gpt-4o",
        "path": chroma_path,
    }
)

When you use LocalContext_OpenAI, it just passes the config to both Vanna wrappers: the LLM (OpenAI_Chat) and the vector store (ChromaDB_VectorStore). The latter expects a path config entry, which it passes to the Chroma client.

You can see that in ./src/vanna/chromadb/chromadb_vector.py. I think that by default it creates the db in the CWD and reloads it from there even if you don't specify a path.
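Since the store is reloaded from disk, you can skip training whenever it already holds entries. A minimal sketch, assuming vn exposes get_training_data() and train(ddl=...) the way Vanna's base class does. train_once and the DDL list are hypothetical names used only for illustration:

```python
def train_once(vn, ddl_statements):
    """Train only when the persisted store is empty.

    Assumes `vn.get_training_data()` returns a sized collection
    (Vanna returns a pandas DataFrame) and `vn.train(ddl=...)` adds
    one DDL statement to the store. Returns True if training ran.
    """
    existing = vn.get_training_data()
    if len(existing) > 0:
        # Data was reloaded from the persisted store; nothing to do.
        return False
    for ddl in ddl_statements:
        vn.train(ddl=ddl)
    return True
```

On the first run this trains and returns True. On later runs against the same path it should find the persisted entries and return False.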

If you are using your own implementation of Vanna, you will have to do something similar and pass a path entry in the config to ChromaDB_VectorStore (assuming you are using it).
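For a custom class, something like the following should work. It is a sketch rather than a definitive implementation: build_config, make_vanna, MyVanna and the VANNA_CHROMA_PATH / OPENAI_API_KEY variable names are my own assumptions, and the import paths assume the module layout mentioned above. The path is resolved to an absolute one so the store does not depend on the CWD:

```python
import os
from pathlib import Path


def build_config(env=os.environ):
    """Build one config dict shared by the LLM and the vector store."""
    # Resolve to an absolute path so reloads do not depend on the CWD.
    chroma_path = Path(env.get("VANNA_CHROMA_PATH", "./vanna-db")).resolve()
    return {
        "api_key": env["OPENAI_API_KEY"],
        "model": "gpt-4o",
        "path": str(chroma_path),
    }


def make_vanna(config):
    """Combine the Chroma vector store with the OpenAI chat wrapper."""
    # Imported lazily so build_config stays usable without vanna installed.
    from vanna.chromadb.chromadb_vector import ChromaDB_VectorStore
    from vanna.openai.openai_chat import OpenAI_Chat

    class MyVanna(ChromaDB_VectorStore, OpenAI_Chat):
        def __init__(self, config=None):
            # Both wrappers receive the same config; the vector store
            # reads the "path" entry from it.
            ChromaDB_VectorStore.__init__(self, config=config)
            OpenAI_Chat.__init__(self, config=config)

    return MyVanna(config=config)
```

Usage would be vn = make_vanna(build_config()), with both environment variables set beforehand.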

Save the trained model to local disk, then reload it from the local file to avoid re-training every time. #477