Open anilmadishetty2498 opened 2 months ago
To resolve the MissingVectorStoreError: No vector store provided error when training using GooglePalm, you need to set up a vector store and pass it to the Agent. Here is how you can do it:
Set the PandasAI API key:
import os
os.environ["PANDASAI_API_KEY"] = "YOUR_PANDASAI_API_KEY"
Instantiate the vector store and pass it to the Agent:
from pandasai import Agent
from pandasai.ee.vectorstores import ChromaDB
# Instantiate the vector store
vector_store = ChromaDB()
# Instantiate the agent with the custom vector store
db = Agent([scm_vc, scm_rev], config={"llm": llm}, vectorstore=vector_store)
Train the model:
query = "identify top 5 customers by revenue for Indonesia"
code = """
import pandas as pd
top_5 = scm_rev[scm_rev['Country']=='Indonesia'].groupby('Customers').agg({'Revenue':'sum'}).sort_values(by='Revenue', ascending=False).head(5)
# declare results
result = {
    "type": "dataframe",
    "value": top_5
}
"""
db.train(queries=[query], codes=[code])
By following these steps, you should be able to resolve the MissingVectorStoreError and successfully train your model [1].
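To make the failure mode concrete, here is a small self-contained sketch of the kind of guard that produces this error. SketchAgent and InMemoryStore are hypothetical stand-ins written for illustration, not pandasai classes, and the real check inside pandasai may differ.

```python
class MissingVectorStoreError(Exception):
    """Stand-in for the pandasai exception of the same name."""


class InMemoryStore:
    """Hypothetical vector store that only records what it is given."""

    def __init__(self):
        self.qa_pairs = []

    def add_question_answer(self, queries, codes):
        if len(queries) != len(codes):
            raise ValueError("queries and codes must have the same length")
        self.qa_pairs.extend(zip(queries, codes))


class SketchAgent:
    """Hypothetical agent: training is delegated to the vector store."""

    def __init__(self, vectorstore=None):
        self._vectorstore = vectorstore

    def train(self, queries, codes):
        # Without a store there is nowhere to save the examples.
        if self._vectorstore is None:
            raise MissingVectorStoreError(
                "No vector store provided. "
                "Please provide a vector store to train the agent."
            )
        self._vectorstore.add_question_answer(queries, codes)


def try_train(agent):
    """Return 'ok' on success, or the error message on failure."""
    try:
        agent.train(queries=["top 5 customers"], codes=["result = ..."])
        return "ok"
    except MissingVectorStoreError as exc:
        return str(exc)


without_store = try_train(SketchAgent())
with_store = try_train(SketchAgent(vectorstore=InMemoryStore()))
```

In this sketch, training fails only when no store was supplied at construction time, which matches the fix of passing vectorstore=... to the Agent.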
To continue talking to Dosu, mention @dosu.
db.train(docs="if asked about variable cost return information from scm_vc table")
ValueError: Number of documents 1 must match number of ids 65
To resolve the ValueError: Number of documents 1 must match number of ids 65 when using db.train with the docs parameter, you need to ensure that the number of documents matches the number of IDs provided. The error indicates that a single document was received alongside 65 IDs, which is a mismatch.
Here's how you can fix it: ensure that the number of documents (docs) matches the number of IDs (ids). Here is an example:
docs = ["if asked about variable cost return information from scm_vc table"]
ids = ["id1"] # Ensure the number of IDs matches the number of documents
# This will work because the number of docs matches the number of ids
db.train(docs=docs, ids=ids)
If you only have one document but multiple IDs, you need to adjust your input to match the expected format [1][2].
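One plausible reading of the numbers in this error, offered as an assumption rather than a confirmed diagnosis: the string passed as docs is exactly 65 characters long, so if the vector store creates one ID per element while iterating over docs, a bare string yields 65 IDs for what is stored as 1 document. The sketch below only demonstrates that Python behavior; make_ids is a hypothetical helper, not a pandasai function.

```python
import uuid

doc = "if asked about variable cost return information from scm_vc table"


def make_ids(docs):
    """Hypothetical helper: one generated ID per element of `docs`."""
    return [f"{uuid.uuid4()}-docs" for _ in docs]


# Iterating over a bare string walks its characters, one ID each.
ids_for_string = make_ids(doc)

# Iterating over a one-element list walks its single document.
ids_for_list = make_ids([doc])

print(len(doc), len(ids_for_string), len(ids_for_list))  # 65 65 1
```

If this reading is right, wrapping the string in a list, as in docs=["..."], would be enough to make the counts agree.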
To resolve the ValueError: Number of documents 1 must match number of ids 65 when using db.train with the docs parameter, you need to ensure that the number of documents matches the number of IDs provided. The error indicates that a single document was received alongside 65 IDs, which is a mismatch.
Here's how you can fix it: ensure that the number of documents (docs) matches the number of IDs (ids). Here is an example:
docs = ["if asked about variable cost return information from scm_vc table"]
ids = ["id1"] # Ensure the number of ids matches the number of docs
# This will work because the number of docs matches the number of ids
db.add_docs(docs=docs, ids=ids)
If you only have one document but multiple IDs, you need to adjust your input to match the expected format [1][2].
To resolve the AttributeError: 'Agent' object has no attribute 'add_docs', ensure that you are calling the add_docs method on an instance of the LanceDB class, not on an Agent object. The add_docs method is defined within the LanceDB class.
Here is an example of how you should instantiate the LanceDB class and call the add_docs method:
from pandasai.ee.vectorstores import LanceDB
# Create an instance of LanceDB
db = LanceDB(
    table_name="your_table_name",
    embedding_function=your_embedding_function,  # Optional, if you have a custom embedding function
    persist_path="/path/to/persist",
    max_samples=10,
    similary_threshold=1.5,
    logger=your_logger  # Optional, if you have a custom logger
)
# Prepare your documents and ids
docs = ["Document 1", "Document 2", "Document 3"]
ids = ["id1", "id2", "id3"]
# Add documents to the database
db.add_docs(docs=docs, ids=ids)
If you are working with the Agent class, you should ensure that the vector store associated with the Agent instance has the appropriate methods. Here is an example of how the Agent class interacts with the vector store:
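from pandasai import Agent
from unittest.mock import MagicMock
# Create an instance of Agent. MagicMock is a test double standing in for a real vector store;
# sample_df and config are assumed to be defined elsewhere.
agent = Agent(sample_df, config, vectorstore=MagicMock())
# Prepare your documents
docs = ["Document 1", "Document 2", "Document 3"]
# Add documents to the vector store through the agent (note: _vectorstore is a private attribute)
agent._vectorstore.add_docs(docs)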
Make sure that the vectorstore object associated with the Agent instance has the add_docs method. This should help you avoid the AttributeError [1][2][3].
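The check described above can be sketched with plain Python. SketchAgent and ListStore are hypothetical stand-ins (not pandasai classes); the point is only that add_docs lives on the store, so looking it up on the agent fails while the store the agent holds provides it.

```python
class ListStore:
    """Hypothetical vector store exposing add_docs."""

    def __init__(self):
        self.docs = {}

    def add_docs(self, docs, ids=None):
        docs = list(docs)
        if ids is None:
            # Generate one sequential ID per document.
            start = len(self.docs)
            ids = [f"doc-{start + i}" for i in range(len(docs))]
        if len(docs) != len(ids):
            raise ValueError(
                f"Number of documents {len(docs)} "
                f"must match number of ids {len(ids)}"
            )
        self.docs.update(zip(ids, docs))
        return ids


class SketchAgent:
    """Hypothetical agent that holds a store but defines no add_docs."""

    def __init__(self, vectorstore):
        self._vectorstore = vectorstore


store = ListStore()
agent = SketchAgent(vectorstore=store)

# The agent itself has no add_docs; the store it was given does.
agent_has_method = hasattr(agent, "add_docs")
store_has_method = callable(getattr(store, "add_docs", None))

new_ids = store.add_docs(["Document 1", "Document 2"])
```

Keeping a reference to the store you pass into the agent, as done with store here, avoids reaching into a private attribute later.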
System Info
python==3.11.7
🐛 Describe the bug
from pandasai import Agent
from pandasai.llm import GooglePalm
llm = GooglePalm(api_key=GOOGLE_API_KEY)
db = Agent([scm_vc, scm_rev], config={"llm": llm})
# Example #2: train the model with Q/A
query = "identify top 5 customers by revenue for Indonesia"
code = """
import pandas as pd
top_5 = scm_rev[scm_rev['Country']=='Indonesia'].groupby('Customers').agg({'Revenue':'sum'}).sort_values(by='Revenue', ascending=False).head(5)
# declare results
result = {
    "type": "dataframe",
    "value": top_5
}
"""
db.train(queries=[query], codes=[code])
MissingVectorStoreError: No vector store provided. Please provide a vector store to train the agent.