Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, MongoDB, NoSQL, etc.). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

wrong result or no result when using llama3.1 / codellama #1326

Open · tobias-schuele opened this issue 2 months ago

tobias-schuele commented 2 months ago

System Info

Apple M2, Sonoma 14.6 (23G80), Python 3.12.5, pandasai 2.2.14

🐛 Describe the bug

The getting-started example (https://docs.pandas-ai.com/library#smartdataframe) produces an incorrect result when using llama3.1:

import pandas as pd
from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

# Point LocalLLM at Ollama's OpenAI-compatible endpoint
ollama_llm = LocalLLM(api_base="http://localhost:11434/v1", model="llama3.1")

# Sample DataFrame
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy", "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000]
})

df = SmartDataframe(sales_by_country, config={"llm": ollama_llm})
df.chat('Which are the top 5 countries by sales?')

Output (actually the bottom 5 countries by sales ;-):

{'type': 'dataframe', 'value':      country  sales
2     France   2900
7  Australia   2600
6     Canada   2500
4      Italy   2300
5      Spain   2100}
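
For reference, the expected answer can be computed directly from the sample data with plain pandas (no LLM involved), e.g. via DataFrame.nlargest:

# Ground truth for the sample data above, no LLM involved
print(sales_by_country.nlargest(5, "sales"))
#           country  sales
# 9           China   7000
# 0   United States   5000
# 8           Japan   4500
# 3         Germany   4100
# 1  United Kingdom   3200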

When using model="codellama", the following error occurs:

Traceback (most recent call last):
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/chat/generate_chat_pipeline.py", line 335, in run
    ).run(input)
      ^^^^^^^^^^
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/pipeline.py", line 137, in run
    raise e
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/pipeline.py", line 101, in run
    step_output = logic.execute(
                  ^^^^^^^^^^^^^^
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 113, in execute
    raise e
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 85, in execute
    result = self.execute_code(code_to_run, code_context)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/tobias/venv/lib/python3.12/site-packages/pandasai/pipelines/chat/code_execution.py", line 175, in execute_code
    raise NoResultFoundError("No result returned")
pandasai.exceptions.NoResultFoundError: No result returned
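
The NoResultFoundError at the end of the trace is raised in code_execution.py when the generated code never assigns the result variable that pandasai expects, which suggests codellama produced code without a result = {"type": ..., "value": ...} declaration. A minimal sketch for inspecting what the model actually generated (the verbose config flag and the last_code_generated attribute are assumptions based on pandasai 2.x behavior; adjust if your version differs):

import pandas as pd
from pandasai import SmartDataframe
from pandasai.llm.local_llm import LocalLLM

# Same sample data as in the report above
sales_by_country = pd.DataFrame({
    "country": ["United States", "United Kingdom", "France", "Germany", "Italy",
                "Spain", "Canada", "Australia", "Japan", "China"],
    "sales": [5000, 3200, 2900, 4100, 2300, 2100, 2500, 2600, 4500, 7000],
})

llm = LocalLLM(api_base="http://localhost:11434/v1", model="codellama")

# verbose=True makes pandasai log each pipeline step, including the
# generated code (assumption: config flag as in pandasai 2.x)
df = SmartDataframe(sales_by_country, config={"llm": llm, "verbose": True})

df.chat("Which are the top 5 countries by sales?")

# Dump the code the model produced; a missing `result = {...}` assignment
# is what triggers NoResultFoundError
# (assumption: attribute name as in pandasai 2.x)
print(df.last_code_generated)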
dosubot[bot] commented 2 months ago

Hey @tobias-schuele! I'm here to help you out with any bugs or questions you might have. Let's work on solving this together!

To resolve the issues with the pandasai library when using the llama3.1 and codellama models, you can use the HuggingFace text-generation inference server. Here is how you can set it up:

  1. Serve a supported large language model (LLM) using HuggingFace's text-generation-inference server. You can find more information about setting up an inference server in the text-generation-inference documentation.

  2. Instantiate the HuggingFaceTextGen model with the inference server URL:

from pandasai.llm import HuggingFaceTextGen
from pandasai import SmartDataframe

# Replace with your inference server URL
llm = HuggingFaceTextGen(inference_server_url="http://127.0.0.1:8080")
df = SmartDataframe("data.csv", config={"llm": llm})

response = df.chat("Which are the top 5 countries by sales?")
print(response)
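
The snippet above reads from a CSV; to reproduce the exact scenario from this report, the same LLM config can be pointed at the in-memory DataFrame instead (a sketch, assuming the inference server from step 1 is already running on port 8080):

# Reuse the in-memory sales_by_country DataFrame from the report
# instead of "data.csv"; SmartDataframe accepts either
df = SmartDataframe(sales_by_country, config={"llm": llm})
print(df.chat("Which are the top 5 countries by sales?"))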

This setup should help mitigate the issues with llama3.1 returning incorrect results and codellama throwing a NoResultFoundError [1].

To continue talking to Dosu, mention @dosu.