Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
I installed the latest version of pandasai last week. Machine: Ubuntu 22.04, 32 GB RAM. Using OpenAI.
I have a table with the list of holidays for the year 2024 for 4 states of India. The columns are 'Date', 'Day', 'Festival', 'Region'. When I ask the question "When is the next holiday in Pune?", I expect the following: Pune is a city that falls under one of the states listed in the CSV file, so the LLM should infer that the holiday date should be picked from the rows for the state to which Pune belongs. This works well with Assistants AI. However, with pandasai it does not work: the generated code looks for the literal value 'Pune' in the table, finds no match, and cannot return an answer.
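A minimal sketch of the expected behavior, written by hand in plain pandas. The `CITY_TO_STATE` mapping, the `next_holiday` helper, and the sample rows are hypothetical illustrations, not part of PandasAI or of the actual CSV. It assumes the file has rows for Maharashtra, the state Pune belongs to, and that the column names match those in the logs.

```python
import pandas as pd

# Hypothetical lookup from a city to the state name used in 'Location/State'.
CITY_TO_STATE = {"Pune": "Maharashtra", "Hyderabad": "Telangana", "Bengaluru": "Karnataka"}


def next_holiday(df: pd.DataFrame, place: str, today: pd.Timestamp):
    """Return the first holiday row after `today` for `place`, or None."""
    # Resolve a city to its state; fall back to the name as given.
    state = CITY_TO_STATE.get(place, place)
    rows = df[df["Location/State"] == state].copy()
    rows["Date"] = pd.to_datetime(rows["Date"], format="%d/%m/%y")
    upcoming = rows[rows["Date"] > today].sort_values(by="Date")
    return None if upcoming.empty else upcoming.iloc[0]


# Made-up sample rows in the same layout as the logged dataframe.
holidays = pd.DataFrame(
    {
        "Date": ["15/08/24", "01/05/24", "02/10/24"],
        "Occasion/Festival": ["Independence Day", "Maharashtra Day", "Gandhi Jayanti"],
        "Location/State": ["Maharashtra", "Maharashtra", "Karnataka"],
    }
)

# A fixed "today" keeps the example reproducible.
row = next_holiday(holidays, "Pune", pd.Timestamp("2024-06-05"))
```

The point is only that the city has to be resolved to a state before filtering; whether that step comes from the LLM's own knowledge or from an explicit mapping like the one above is left open.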
🐛 Describe the bug
Here are the logs:
dfs[0]:39x5
No,Date,Day,Occasion/Festival,Location/State
17,01/05/24,Monday,Good Friday,Telangana
3,19/02/24,Wednesday,Diwali Amavasya (Laxmi Pujan),Rest of India
6,29/03/24,Thursday,Independence Day,Karnataka
The user asked the following question:
QUERY
when is the next holiday in Pune?
You generated this python code:
pune_holidays = dfs[0][dfs[0]['Location/State'] == 'Pune']
pune_holidays['Date'] = pd.to_datetime(pune_holidays['Date'], format='%d/%m/%y')
pune_holidays = pune_holidays.sort_values(by='Date')
next_holiday = pune_holidays[pune_holidays['Date'] > pd.Timestamp.now()].iloc[0]
next_holiday_date = next_holiday['Date'].strftime('%d/%m/%y')
result = {'type': 'string', 'value': f"The next holiday in Pune is on {next_holiday_date}, which is {next_holiday['Occasion/Festival']}."}
2024-06-05 16:24:30 [INFO] Code generated:
import pandas as pd
pune_holidays = dfs[0][dfs[0]['Location/State'] == 'Pune']
pune_holidays['Date'] = pd.to_datetime(pune_holidays['Date'], format='%d/%m/%y')
pune_holidays = pune_holidays.sort_values(by='Date')
next_holiday = pune_holidays[pune_holidays['Date'] > pd.Timestamp.now()]
if not next_holiday.empty:
    next_holiday = next_holiday.iloc[0]
    next_holiday_date = next_holiday['Date'].strftime('%d/%m/%y')
    result = {'type': 'string', 'value': f"The next holiday in Pune is on {next_holiday_date}, which is {next_holiday['Occasion/Festival']}."}
else:
    result = {'type': 'string', 'value': "There are no upcoming holidays in Pune."}
result
2024-06-05 16:24:30 [INFO] Executing Step 2: CodeCleaning
2024-06-05 16:24:30 [INFO] Code running:
pune_holidays = dfs[0][dfs[0]['Location/State'] == 'Pune']
pune_holidays['Date'] = pd.to_datetime(pune_holidays['Date'], format='%d/%m/%y')
pune_holidays = pune_holidays.sort_values(by='Date')
next_holiday = pune_holidays[pune_holidays['Date'] > pd.Timestamp.now()]
if not next_holiday.empty:
    next_holiday = next_holiday.iloc[0]
    next_holiday_date = next_holiday['Date'].strftime('%d/%m/%y')
    result = {'type': 'string', 'value': f"The next holiday in Pune is on {next_holiday_date}, which is {next_holiday['Occasion/Festival']}."}
else:
    result = {'type': 'string', 'value': 'There are no upcoming holidays in Pune.'}
result
2024-06-05 16:24:30 [INFO] Executing Step 7: ResultValidation
2024-06-05 16:24:30 [INFO] Answer: {'type': 'string', 'value': 'There are no upcoming holidays in Pune.'}
2024-06-05 16:24:30 [INFO] Executing Step 8: ResultParsing
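The answer in the log follows directly from the exact-match filter: 'Pune' never appears in 'Location/State', so the filtered frame is empty and the `else` branch runs. A small reproduction in plain pandas, using made-up rows (the values below are illustrative, not the real data):

```python
import pandas as pd

# Illustrative rows only; the column holds state names, never cities.
df = pd.DataFrame(
    {
        "Date": ["15/08/24", "02/10/24"],
        "Occasion/Festival": ["Independence Day", "Gandhi Jayanti"],
        "Location/State": ["Telangana", "Karnataka"],
    }
)

# Same filter as in the generated code: exact equality on the city name.
pune_holidays = df[df["Location/State"] == "Pune"]

# The city is not one of the column's values, so nothing can match.
known_locations = sorted(df["Location/State"].unique())
city_is_listed = "Pune" in known_locations
```

Here `pune_holidays` is empty and `city_is_listed` is false, which is the condition that leads to "There are no upcoming holidays in Pune."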