Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

'NoneType' object is not subscriptable #783

Open bennofatius opened 7 months ago

bennofatius commented 7 months ago

System Info

OS: macOS 13.4.1
Python version: 3.11.5
pandasai version: 1.5.4

🐛 Describe the bug

With an example as simple as the one below, I get this response: "Unfortunately, I was not able to answer your question, because of the following error:\n\n'NoneType' object is not subscriptable\n"

import pandas as pd
from pandasai import SmartDataframe

from pandasai.llm import OpenAI
llm = OpenAI(api_token="<myToken>")

df = pd.DataFrame({
    'Name': ['Ben', 'Anita', 'Frank', 'Alfred'],
    'Favorite color': ['Red', 'Blue', 'Green', 'Yellow'],
    'Socks owned': [7, 16, 22, 3],
})

df = SmartDataframe(df, config={"llm": llm})

df.chat('Who owns most socks?')

This is what's happening in the logs:

The user asked the following question: Q: Who owns most socks?

You generated this python code:

# TODO: import the required dependencies
import pandas as pd

# Write code here
socks_owned = []
for df in dfs:
    socks_owned.append(df['Socks owned'].sum())

max_socks_index = socks_owned.index(max(socks_owned))
owner = dfs[max_socks_index]['Name'][0]

result = {"type": "string", "value": f"{owner} owns the most socks."}

It fails with the following error:

Traceback (most recent call last):
  File "<…>/lib/python3.11/site-packages/pandasai/pipelines/smart_datalake_chat/code_execution.py", line 46, in execute
    result = pipeline_context.query_exec_tracker.execute_func(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<…>/lib/python3.11/site-packages/pandasai/helpers/query_exec_tracker.py", line 128, in execute_func
    result = function(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^
  File "<…>/lib/python3.11/site-packages/pandasai/helpers/code_manager.py", line 203, in execute_code
    exec(code_to_run, environment)
  File "<string>", line 3, in <module>
TypeError: 'NoneType' object is not subscriptable

emanueleparini commented 7 months ago

Sometimes it happens to me too, especially with dates.

gventuri commented 7 months ago

@bennofatius thanks a lot for reporting. In this case it looks like a hallucination to me: the generated code doesn't make a lot of sense. We can investigate it a bit further, though!

bennofatius commented 7 months ago

Thanks, that'd be great! I tried with both gpt-3.5-turbo and gpt-4-1106-preview, and both produced the same result.

emanueleparini commented 7 months ago

When I used version 1.5.1, questions about dates always came back fine; since 1.5.4 I get this error every now and then. I'm using OpenAI.

Shudh commented 6 months ago

I just spent some time on this error, as I am also affected by it. With a debugger I can see that dfs, which should be a list of DataFrames, is None in the environment. The OpenAI-generated code was fine, but the df I created from a CSV and passed to SmartDataframe was not passed to the LLM-generated code. I got too tired, but I may try to find the root cause again.

Shudh commented 6 months ago

Here is a complete debug trace, where you can clearly see that it is not a hallucination. The code the LLM generated via OpenAI is correct, but the list of dataframes being sent to it is clearly empty. In 3 months the project has developed considerably, with Smart entities and abstractions, so it would be faster if you could have a look. I may return to it. For now, I will wait before integrating pandasai and use OpenAI function calling instead. Anyway, this is a wonderful tool. For now it fails with the following error (log and my half-hour debug video attached): pandasai.log https://drive.google.com/file/d/1PLHwo_JgfC5PHG3lDb1DNL7SxbhpYdBI/view?usp=drive_link

Traceback (most recent call last):
  File "/home/shudh/PycharmProjects/pandas-ai-Dec4-23/pandasai/pipelines/smart_datalake_chat/code_execution.py", line 46, in execute
    result = pipeline_context.query_exec_tracker.execute_func(
  File "/home/shudh/PycharmProjects/pandas-ai-Dec4-23/pandasai/helpers/query_exec_tracker.py", line 128, in execute_func
    result = function(*args, **kwargs)
  File "/home/shudh/PycharmProjects/pandas-ai-Dec4-23/pandasai/helpers/code_manager.py", line 205, in execute_code
    exec(code_to_run, environment)
  File "<string>", line 3, in <module>
TypeError: 'NoneType' object is not subscriptable

Fix the python code above and return the new python code:
2023-12-05 08:25:37 [INFO] HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2023-12-05 08:25:37 [INFO] Saving charts to /home/shudh/PycharmProjects/pandas-ai-Dec4-23/exports/charts/temp_chart.png
2023-12-05 08:25:37 [INFO] Code running:

symbol_wise_pnl = []
for df in dfs:
    symbol = df['Instrument'].iloc[0]
    pnl = df['P&L'].sum()
    symbol_wise_pnl.append({'Symbol': symbol, 'P&L': pnl})
symbol_wise_pnl_df = pd.DataFrame(symbol_wise_pnl)
result = {'type': 'dataframe', 'value': symbol_wise_pnl_df}

Shudh commented 6 months ago

Please let me know if somewhere in the video you can see my openai keys.. I hope not :(

hywxs1993 commented 6 months ago

I hit a very similar problem. In my case, the cause is _required_dfs in pandas-ai/pandasai/helpers/code_manager.py. This function filters the df list like this:

for i, df in enumerate(self._dfs):
    if f"dfs[{i}]" in code:
        required_dfs.append(df)
    else:
        required_dfs.append(None)

As I mentioned in https://github.com/gventuri/pandas-ai/issues/791#issuecomment-1842792416
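
The effect of that filter can be reproduced in isolation. The sketch below is a minimal, self-contained illustration and not PandasAI's actual code: `filter_required` is a hypothetical stand-in for the loop quoted above, and plain dicts stand in for DataFrames. It suggests that generated code which iterates with `for df in dfs` never contains the literal text `dfs[0]`, so every entry is replaced by None and subscripting it raises the error from this issue.

```python
def filter_required(dfs, code):
    """Hypothetical stand-in for the filtering loop quoted above."""
    required = []
    for i, df in enumerate(dfs):
        # Keep a frame only if the code text mentions it as dfs[<i>] literally.
        required.append(df if f"dfs[{i}]" in code else None)
    return required


# A plain dict stands in for a DataFrame; only subscripting matters here.
frames = [{"Socks owned": [7, 16, 22, 3]}]

looping_code = "for df in dfs:\n    total = sum(df['Socks owned'])"
indexed_code = "total = sum(dfs[0]['Socks owned'])"

# The looping variant never mentions "dfs[0]", so its only frame becomes None.
looping_env = {"dfs": filter_required(frames, looping_code)}
try:
    exec(looping_code, looping_env)
    error_message = None
except TypeError as err:
    error_message = str(err)

# The indexed variant mentions "dfs[0]", so the frame survives the filter.
indexed_env = {"dfs": filter_required(frames, indexed_code)}
exec(indexed_code, indexed_env)

print(error_message)
print(indexed_env["total"])
```

If this matches what happens inside the library, it would explain why the same question works when the LLM happens to index `dfs[0]` directly and fails when it writes a loop.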

Shudh commented 6 months ago

OK, I will add a breakpoint here and update my findings.

gventuri commented 6 months ago

What I recommend at the moment is reverting to a previous version (for example 1.4.7 or 1.5.1). We are working hard on fixing this issue and we'll keep you updated about the new developments!

gventuri commented 6 months ago

This is most likely an issue related to the prompt and the generated output, but we are also considering different scenarios.

Shudh commented 6 months ago

pandasai.log Hi Gabriele, as discussed, I have checked out v1.5.1 and changed absolutely nothing else (no prompt change, no code change, no CSV file path change) compared to when I was running the latest version, and this particular case ran fine. Since verbose was on, I also have a detailed log that contains both the working path and the non-working path. I hope it helps you and the team resolve this. Thanks a lot for your wonderful library. Regards, Shudh

gventuri commented 6 months ago

Thanks a lot to each of you for the contribution! This particular bug should be fixed as of 1.5.7, but I've noticed that GPT-3.5/4 sometimes uses for loops for no reason, and we're also trying to fix that through the prompt.

Please, try it out and let me know :)
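
For reference, the original question does not need a loop at all. The following is only a sketch of what loop-free code for the example at the top of this issue could look like; it is not output from PandasAI or any LLM, and the `dfs` list is built by hand here on the assumption that the library exposes the input frames under that name.

```python
import pandas as pd

df = pd.DataFrame({
    'Name': ['Ben', 'Anita', 'Frank', 'Alfred'],
    'Favorite color': ['Red', 'Blue', 'Green', 'Yellow'],
    'Socks owned': [7, 16, 22, 3],
})

# Assumption: the generated code sees the input frames as a list named dfs.
dfs = [df]

# idxmax returns the index label of the row with the largest value,
# which .loc then uses to look up the matching name.
owner = dfs[0].loc[dfs[0]['Socks owned'].idxmax(), 'Name']

result = {"type": "string", "value": f"{owner} owns the most socks."}
print(result["value"])
```

Because this version refers to `dfs[0]` explicitly, it would also pass a filter that keeps only frames mentioned by index in the code text.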

Jimchoo91 commented 2 months ago

Still an issue using the latest PandasAI and GPT-4...