Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

'No code found in the response' error - retry code gen with chatgpt #1128

Closed gDanzel closed 1 week ago

gDanzel commented 5 months ago

System Info

pandasai: 2.0.35
python: 3.11
OS: win11

🐛 Describe the bug

When the error correction framework is triggered, the LLM often responds without the triple-backtick code delimiter (in my case ChatGPT 3.5 Turbo, about 3 out of 5 times). pandasai then raises a 'No code found in the response' error, which breaks the pipeline. If the delimiter is required, would it make sense to explicitly ask the LLM to wrap the code in that delimiter in its response?
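To make the failure mode concrete, here is a minimal sketch of a more tolerant extraction step. It is not pandasai's implementation: `extract_code`, `FENCE`, and `FENCED_BLOCK` are hypothetical names, and falling back to a syntax check on the whole reply is an assumption about what a lenient parser could do when the delimiter is missing.

```python
import ast
import re

FENCE = "`" * 3  # the triple-backtick delimiter
# Optional language tag after the opening fence, then capture the body lazily.
FENCED_BLOCK = re.compile(
    re.escape(FENCE) + r"(?:python|py)?[ \t]*\n(.*?)" + re.escape(FENCE),
    re.DOTALL,
)


def extract_code(response: str) -> str:
    """Hypothetical helper (not pandasai's): return the Python code in an LLM reply."""
    match = FENCED_BLOCK.search(response)
    # Prefer the fenced body; otherwise treat the whole reply as candidate code.
    candidate = (match.group(1) if match else response).strip()
    if not candidate:
        raise ValueError("No code found in the response")
    try:
        ast.parse(candidate)  # accept the candidate only if it is valid Python syntax
    except SyntaxError as exc:
        raise ValueError("No code found in the response") from exc
    return candidate
```

The fallback is deliberately permissive: a one-word reply such as `done` is also valid Python syntax, so a real fix would probably still want the prompt to ask for the delimiter explicitly.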

import pandasai.pandas as pd
from pandasai import Agent
from pandasai.helpers import get_openai_callback
from pandasai.llm import OpenAI, GoogleGemini

from data.sample_dataframe import dataframe

llm = OpenAI()
agent = Agent([pd.DataFrame(dataframe)], config={"llm": llm, "enforce_privacy": True, "verbose": True, "enable_cache": False,})
with get_openai_callback() as cb:
    response = agent.chat("Calculate average, mean, median of DGP and hapineess_index.")
    print(response)
    print(cb)
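As a caller-side stopgap, the call can simply be repeated when it comes back with the failure message that appears at the end of the trace in this report. This is only a sketch: `chat_with_retry` and `FAILURE_PREFIX` are hypothetical names, and it assumes a failed call returns that message as a plain string rather than raising.

```python
from typing import Any, Callable

# Prefix of the failure message printed at the end of the trace in this report.
FAILURE_PREFIX = "Unfortunately, I was not able to answer your question"


def chat_with_retry(chat: Callable[[str], Any], question: str, attempts: int = 3) -> Any:
    """Call chat(question) up to `attempts` times; return the first non-failure reply."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    response = None
    for _ in range(attempts):
        response = chat(question)
        failed = isinstance(response, str) and response.startswith(FAILURE_PREFIX)
        if not failed:
            break
    # If every attempt failed, the last failure message is returned unchanged.
    return response
```

With the script above this would be called as `chat_with_retry(agent.chat, "...")`. Retrying only helps because `enable_cache` is `False` in the config, so each attempt sends a fresh request to the LLM.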

The trace

C:\Users\Danzel\PycharmProjects\Tools\venv\Scripts\python.exe C:\Users\Danzel\PycharmProjects\Tools\yunai\privacyenforced.py 
2024-04-22 01:02:15 [INFO] Question: Calculate average, mean, median of DGP and hapineess_index.
2024-04-22 01:02:16 [INFO] Running PandasAI with openai LLM...
2024-04-22 01:02:16 [INFO] Prompt ID: 5b90df7a-261a-4442-a096-3a52c7387557
2024-04-22 01:02:16 [INFO] Executing Pipeline: GenerateChatPipeline
2024-04-22 01:02:16 [INFO] Executing Step 0: ValidatePipelineInput
2024-04-22 01:02:16 [INFO] Executing Step 1: CacheLookup
2024-04-22 01:02:16 [INFO] Executing Step 2: PromptGeneration
2024-04-22 01:02:19 [INFO] Using prompt: <dataframe>
dfs[0]:10x3
country,gdp,happiness_index
Germany,2518917471,7.23
China,5877894258,5.87
Spain,3873144538,6.38
</dataframe>

Update this initial code:
```python
# TODO: import the required dependencies
import pandas as pd

# Write code here

# Declare result var: 
type (possible values "string", "number", "dataframe", "plot"). Examples: { "type": "string", "value": f"The highest salary is {highest_salary}." } or { "type": "number", "value": 125 } or { "type": "dataframe", "value": pd.DataFrame({...}) } or { "type": "plot", "value": "temp_chart.png" }
```

### QUERY
Calculate average, mean, median of DGP and hapineess_index.

Variable dfs: list[pd.DataFrame] is already declared.

At the end, declare "result" variable as a dictionary of type and value.

If you are asked to plot a chart, use "matplotlib" for charts, save as png.

Generate python code and return full updated code:
2024-04-22 01:02:19 [INFO] Executing Step 3: CodeGenerator
2024-04-22 01:02:26 [INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-04-22 01:02:26 [INFO] Prompt used:

<dataframe>
dfs[0]:10x3
country,gdp,happiness_index
Germany,2518917471,7.23
China,5877894258,5.87
Spain,3873144538,6.38
</dataframe>

Update this initial code:

# TODO: import the required dependencies
import pandas as pd

# Write code here

# Declare result var: 
type (possible values "string", "number", "dataframe", "plot"). Examples: { "type": "string", "value": f"The highest salary is {highest_salary}." } or { "type": "number", "value": 125 } or { "type": "dataframe", "value": pd.DataFrame({...}) } or { "type": "plot", "value": "temp_chart.png" }

QUERY

Calculate average, mean, median of DGP and hapineess_index.

Variable dfs: list[pd.DataFrame] is already declared.

At the end, declare "result" variable as a dictionary of type and value.

If you are asked to plot a chart, use "matplotlib" for charts, save as png.

Generate python code and return full updated code:

2024-04-22 01:02:26 [INFO] Code generated:

            # TODO: import the required dependencies
import pandas as pd

# Write code here
dfs = []
data = {'country': ['Germany', 'China', 'Spain'],
        'gdp': [2518917471, 5877894258, 3873144538],
        'happiness_index': [7.23, 5.87, 6.38]}
df = pd.DataFrame(data)
dfs.append(df)

# Calculate average, mean, median of GDP and happiness_index
gdp_avg = df['gdp'].mean()
gdp_mean = df['gdp'].median()
gdp_median = df['gdp'].median()

happiness_avg = df['happiness_index'].mean()
happiness_mean = df['happiness_index'].median()
happiness_median = df['happiness_index'].median()

# Declare result var
result = {
    "GDP": {
        "average": gdp_avg,
        "mean": gdp_mean,
        "median": gdp_median
    },
    "happiness_index": {
        "average": happiness_avg,
        "mean": happiness_mean,
        "median": happiness_median
    }
}

2024-04-22 01:02:26 [INFO] Executing Step 4: CachePopulation
2024-04-22 01:02:26 [INFO] Executing Step 5: CodeCleaning
2024-04-22 01:02:26 [INFO] Code running:

data = {'country': ['Germany', 'China', 'Spain'], 'gdp': [2518917471, 5877894258, 3873144538], 'happiness_index': [7.23, 5.87, 6.38]}
df = dfs[0]
dfs.append(df)
gdp_avg = df['gdp'].mean()
gdp_mean = df['gdp'].median()
gdp_median = df['gdp'].median()
happiness_avg = df['happiness_index'].mean()
happiness_mean = df['happiness_index'].median()
happiness_median = df['happiness_index'].median()
result = {'GDP': {'average': gdp_avg, 'mean': gdp_mean, 'median': gdp_median}, 'happiness_index': {'average': happiness_avg, 'mean': happiness_mean, 'median': happiness_median}}

2024-04-22 01:02:26 [INFO] Executing Step 6: CodeExecution
2024-04-22 01:02:26 [ERROR] Failed with error: Traceback (most recent call last):
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 96, in execute
    if not OutputValidator.validate_result(result):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\helpers\output_validator.py", line 73, in validate_result
    raise InvalidOutputValueMismatch(
pandasai.exceptions.InvalidOutputValueMismatch: Result must be in the format of dictionary of type and value.

2024-04-22 01:02:26 [WARNING] Failed to execute code retrying with a correction framework [retry number: 1]
2024-04-22 01:02:26 [INFO] Executing Pipeline: ErrorCorrectionPipeline
2024-04-22 01:02:26 [INFO] Executing Step 0: ErrorPromptGeneration
2024-04-22 01:02:26 [INFO] Using prompt: <dataframe>
dfs[0]:10x3
country,gdp,happiness_index
Germany,2518917471,7.23
China,5877894258,5.87
Spain,3873144538,6.38
</dataframe>

The user asked the following question:

QUERY

Calculate average, mean, median of DGP and hapineess_index.

You generated this python code:
data = {'country': ['Germany', 'China', 'Spain'], 'gdp': [2518917471, 5877894258, 3873144538], 'happiness_index': [7.23, 5.87, 6.38]}
df = dfs[0]
dfs.append(df)
gdp_avg = df['gdp'].mean()
gdp_mean = df['gdp'].median()
gdp_median = df['gdp'].median()
happiness_avg = df['happiness_index'].mean()
happiness_mean = df['happiness_index'].median()
happiness_median = df['happiness_index'].median()
result = {'GDP': {'average': gdp_avg, 'mean': gdp_mean, 'median': gdp_median}, 'happiness_index': {'average': happiness_avg, 'mean': happiness_mean, 'median': happiness_median}}

It fails with the following error:
Traceback (most recent call last):
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 96, in execute
    if not OutputValidator.validate_result(result):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\helpers\output_validator.py", line 73, in validate_result
    raise InvalidOutputValueMismatch(
pandasai.exceptions.InvalidOutputValueMismatch: Result must be in the format of dictionary of type and value.

Fix the python code above and return the new python code:
2024-04-22 01:02:26 [INFO] Executing Step 1: CodeGenerator
2024-04-22 01:02:30 [INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-04-22 01:02:30 [ERROR] Pipeline failed on step 1: No code found in the response
2024-04-22 01:02:30 [ERROR] Pipeline failed on step 6: No code found in the response
Traceback (most recent call last):
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 96, in execute
    if not OutputValidator.validate_result(result):
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\helpers\output_validator.py", line 73, in validate_result
    raise InvalidOutputValueMismatch(
pandasai.exceptions.InvalidOutputValueMismatch: Result must be in the format of dictionary of type and value.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\generate_chat_pipeline.py", line 307, in run
    output = (self.code_generation_pipeline | self.code_execution_pipeline).run(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\pipeline.py", line 137, in run
    raise e
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\pipeline.py", line 101, in run
    step_output = logic.execute(
                  ^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 125, in execute
    code_to_run = self._retry_run_code(
                  ^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_execution.py", line 346, in _retry_run_code
    return self.on_retry(code, e)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\generate_chat_pipeline.py", line 149, in on_code_retry
    return self.code_exec_error_pipeline.run(correction_input)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\error_correction_pipeline\error_correction_pipeline.py", line 48, in run
    return self.pipeline.run(input)
           ^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\pipeline.py", line 137, in run
    raise e
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\pipeline.py", line 101, in run
    step_output = logic.execute(
                  ^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\pipelines\chat\code_generator.py", line 33, in execute
    code = pipeline_context.config.llm.generate_code(input, pipeline_context)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\llm\base.py", line 202, in generate_code
    return self._extract_code(response)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "C:\Users\Danzel\PycharmProjects\Tools\venv\Lib\site-packages\pandasai\llm\base.py", line 122, in _extract_code
    raise NoCodeFoundError("No code found in the response")
pandasai.exceptions.NoCodeFoundError: No code found in the response
Unfortunately, I was not able to answer your question, because of the following error:

No code found in the response

Process finished with exit code 0
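For context on the first failure in the trace (the one that triggers the retry): the generated code builds a nested dictionary of statistics, while the prompt asks for a dictionary with a "type" and a "value". The sketch below illustrates the difference using only the three sample rows shown in the prompt. `has_type_and_value` is a hypothetical stand-in, not pandasai's `OutputValidator`, and the allowed types are taken from the prompt text.

```python
from statistics import mean, median

# Allowed values for "type", as listed in the prompt quoted in the trace.
ALLOWED_TYPES = {"string", "number", "dataframe", "plot"}


def has_type_and_value(result: object) -> bool:
    """Hypothetical check (not pandasai's validator): is result a type/value dict?"""
    return (
        isinstance(result, dict)
        and "value" in result
        and isinstance(result.get("type"), str)
        and result["type"] in ALLOWED_TYPES
    )


# The three sample rows shown in the prompt.
gdp = [2518917471, 5877894258, 3873144538]
happiness = [7.23, 5.87, 6.38]

# Shape produced in the trace: nested statistics, no "type"/"value" keys.
rejected = {"GDP": {"mean": mean(gdp), "median": median(gdp)}}

# The same numbers wrapped in the envelope the prompt asks for.
accepted = {
    "type": "string",
    "value": (
        f"GDP mean {mean(gdp):.0f}, median {median(gdp)}; "
        f"happiness_index mean {mean(happiness):.2f}, median {median(happiness)}"
    ),
}

print(has_type_and_value(rejected), has_type_and_value(accepted))
```

The generated code in the trace also assigns `.median()` to the variables named `gdp_mean` and `happiness_mean`, so even a correctly shaped result would have reported the median as the mean.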

adamingas commented 3 months ago

@gDanzel Is this solved now?