Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com
Other
12.71k stars 1.23k forks

Column MultiIndex support #1150

Closed iAbadia closed 1 month ago

iAbadia commented 4 months ago

🚀 The feature

At the moment I'm unable to use an Agent with a DataFrame that has MultiIndex columns. I would like to be able to do this.

Motivation, pitch

MultiIndex DataFrames let the user organise data in a more meaningful way, which can also guide the Agent towards better answers.

Alternatives

Flattening the MultiIndex into a simple Index
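
For reference, the flattening workaround can be sketched roughly as below. This is only an illustration: the sample data, the `flatten_columns` helper, and the `_` separator are made up here, and nothing in it is part of PandasAI.

```python
import pandas as pd

# Hypothetical sample data with two column levels (not taken from the issue).
df = pd.DataFrame(
    [[1, 2, 3, 4]],
    columns=pd.MultiIndex.from_product([["sales", "costs"], ["2022", "2023"]]),
)


def flatten_columns(frame: pd.DataFrame, sep: str = "_") -> pd.DataFrame:
    """Return a copy whose MultiIndex columns are joined into plain strings."""
    flat = frame.copy()
    # Each MultiIndex column label is a tuple such as ("sales", "2022").
    flat.columns = [sep.join(str(level) for level in col) for col in frame.columns]
    return flat


flat_df = flatten_columns(df)
print(list(flat_df.columns))
```

The downside is that the grouping carried by the outer level only survives as a name prefix, which is why this is listed as an alternative rather than a fix.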

Additional context

Here's the error I get when I try to query a SmartDataframe whose columns are a MultiIndex:

Traceback (most recent call last):
  File "../site-packages/pandasai/pipelines/chat/generate_chat_pipeline.py", line 307, in run
    output = (self.code_generation_pipeline | self.code_execution_pipeline).run(
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "../site-packages/pandasai/pipelines/pipeline.py", line 137, in run
    raise e
  File "../site-packages/pandasai/pipelines/pipeline.py", line 101, in run
    step_output = logic.execute(
                  ^^^^^^^^^^^^^^
  File "../site-packages/pandasai/pipelines/chat/cache_lookup.py", line 36, in execute
    pipeline_context.cache.get_cache_key(pipeline_context)
  File "../site-packages/pandasai/helpers/cache.py", line 100, in get_cache_key
    cache_key += str(df.column_hash)
                     ^^^^^^^^^^^^^^
  File "../site-packages/pandasai/connectors/pandas.py", line 131, in column_hash
    columns_str = "".join(self.pandas_df.columns)
                  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: sequence item 0: expected str instance, tuple found
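
The last frame suggests the failure comes from passing tuple labels to `"".join(...)`, which only accepts strings. Below is a minimal sketch that reproduces the `TypeError` outside PandasAI and shows one way a column hash could tolerate MultiIndex columns. The standalone `column_hash` function, the `|` separator, and the use of SHA-256 are assumptions for illustration, not the library's actual implementation or fix.

```python
import hashlib

import pandas as pd

# Hypothetical frame with MultiIndex columns; each label is a tuple.
multi = pd.DataFrame(
    [[1, 2]],
    columns=pd.MultiIndex.from_tuples([("a", "x"), ("a", "y")]),
)

# Reproduce the failure: str.join rejects non-string items.
try:
    "".join(multi.columns)
    error_message = None
except TypeError as exc:
    error_message = str(exc)


def column_hash(frame: pd.DataFrame) -> str:
    """Hypothetical stand-in: hash column labels after converting each to str."""
    # str() turns a tuple label like ("a", "x") into "('a', 'x')",
    # so plain and MultiIndex columns are handled the same way.
    columns_str = "|".join(str(col) for col in frame.columns)
    return hashlib.sha256(columns_str.encode("utf-8")).hexdigest()


print(error_message)
print(column_hash(multi))
```

Converting every label with `str()` keeps the behaviour for ordinary string columns simple while avoiding the crash on tuples. Whether the resulting key is suitable for PandasAI's cache is something I have not checked.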