Sinaptik-AI / pandas-ai

Chat with your database (SQL, CSV, pandas, polars, mongodb, noSQL, etc). PandasAI makes data analysis conversational using LLMs (GPT 3.5 / 4, Anthropic, VertexAI) and RAG.
https://pandas-ai.com

No plotting output with PandasAI 0.6.5 #323

Closed aiakubovich closed 4 months ago

aiakubovich commented 1 year ago

🐛 Describe the bug

Hi. I have a Streamlit app that has chat functionality. The core function of the app is:

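    # Imports this function relies on
    import os
    import sys
    from io import BytesIO, StringIO

    import matplotlib.pyplot as plt
    import streamlit as st

    def get_agent_response(self, uploaded_file_content, query, plot_container):

        from pandasai import PandasAI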
        # from pandasai.llm.openai import OpenAI
        from pandasai.llm.azure_openai import AzureOpenAI

        llm = AzureOpenAI(
                            api_token = os.environ["OPENAI_API_KEY"], 
                            api_base = os.environ["OPENAI_API_BASE"], 
                            deployment_name = os.environ["OPENAI_API_MODEL_NAME"],
                            model_name = os.environ["OPENAI_API_MODEL_NAME"]
                            )
        pandas_ai = PandasAI(llm, verbose=True)
        old_stdout = sys.stdout
        sys.stdout = captured_output = StringIO()

        response = pandas_ai.run(data_frame = uploaded_file_content, prompt=query)
        fig = plt.gcf()
        if fig.get_axes():
            # Adjust the figure size
            fig.set_size_inches(12, 6)

            # Adjust the layout tightness
            plt.tight_layout()
            buf = BytesIO()
            fig.savefig(buf, format="png")
            buf.seek(0)
            with plot_container:
                st.image(buf, caption="Generated Plot")

        sys.stdout = old_stdout
        return response, captured_output

This function works with both 0.2.15 and 0.6.5 when the output is text. When the output is a plot, it works only with 0.2.15; with 0.6.5 nothing is displayed. I was unable to find the reason why. The full example of the chatbot can be found here: https://github.com/yvann-hub/Robby-chatbot/blob/main/src/pages/2_%F0%9F%93%8A%20Robby-Sheet%20(beta).py

This looks like a PandasAI bug.
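
The function above swaps sys.stdout by hand around the PandasAI call. As a side note, here is a minimal sketch of the same capture using contextlib.redirect_stdout from the standard library, which restores stdout even if the call raises. The names run_and_capture and fake_run are hypothetical; fake_run only stands in for pandas_ai.run(...) and does not reflect PandasAI's behaviour.

```python
import contextlib
import io


def run_and_capture(fn, *args, **kwargs):
    """Call fn and return (result, text printed to stdout while it ran)."""
    buffer = io.StringIO()
    # redirect_stdout puts the original sys.stdout back on exit, even on error
    with contextlib.redirect_stdout(buffer):
        result = fn(*args, **kwargs)
    return result, buffer.getvalue()


def fake_run(prompt):
    # Hypothetical stand-in for pandas_ai.run(...): prints a log line, returns None
    print(f"running: {prompt}")
    return None


response, logs = run_and_capture(fake_run, "plot tier counts")
```

In the Streamlit function, the wrapped call would be the pandas_ai.run(...) line, with the figure handling left unchanged.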

aiakubovich commented 1 year ago

Some investigation:

If the query is 'plot horizontal bar chart for entries in "tier" column.', then:

For 0.2.15, I get a plot as output, plus the following response: Sure, I can help you with that! To plot a horizontal bar chart for the entries in the "tier" column, we can use a data visualization tool like Excel or Google Sheets. This will allow us to easily create a visual representation of the data that shows the frequency of each tier. Would you like me to walk you through the steps?

For 0.6.5, the response is None and there is no plot.

gventuri commented 1 year ago

@aiakubovich thanks a lot for reporting. We have released a Streamlit middleware to plot charts using Streamlit. Check it out: https://pandas-ai.readthedocs.io/en/latest/middlewares/#streamlitmiddleware

Let me know if it fixes the issue!

aiakubovich commented 1 year ago

Hi @gventuri, thank you for the response. I tried to use the Streamlit middleware, but I am still not getting any plots.

I also tried to create a simple example in JupyterLab, but the response variable is None.


code:

from pandasai import PandasAI
from pandasai.llm.azure_openai import AzureOpenAI
import pandas as pd

data = {'tier': ['A', 'B', 'A', 'C', 'B', 'A', 'C']}
uploaded_file_content = pd.DataFrame(data)

llm = AzureOpenAI(
    api_token=api_token,
    api_base=api_base,
    deployment_name=deployment_name,
    model_name=model_name,
)
pandas_ai = PandasAI(llm, verbose=True)

query = 'plot horizontal bar chart for entries in "tier" column.'
response = pandas_ai.run(data_frame=uploaded_file_content, prompt=query)

print("Response:", response)

aiakubovich commented 1 year ago

Also, in the Streamlit logs I am getting: <string>:2: UserWarning: Matplotlib is currently using agg, which is a non-GUI backend, so cannot show the figure.
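
The warning above says that plt.show() cannot open a window under the Agg backend. The figure can still be rendered to bytes and handed to whatever displays it. A minimal sketch, assuming matplotlib is installed; figure_to_png_bytes is a hypothetical helper name, and the bar values are the counts of the "tier" column from the Jupyter example above.

```python
import io

import matplotlib

matplotlib.use("Agg")  # non-GUI backend, the one named in the warning
import matplotlib.pyplot as plt


def figure_to_png_bytes(fig):
    """Render a matplotlib figure to PNG bytes without opening a window."""
    buf = io.BytesIO()
    fig.savefig(buf, format="png")
    return buf.getvalue()


fig, ax = plt.subplots()
# Counts of A, B, C in the sample "tier" column
ax.barh(["A", "B", "C"], [3, 2, 2])
png = figure_to_png_bytes(fig)
plt.close(fig)
```

In a Streamlit app, the resulting bytes could be passed to st.image, as the original function does with its BytesIO buffer.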

ihorizons2022 commented 1 year ago

I have the same issue. How can I fix it?

gventuri commented 1 year ago

@aiakubovich @ihorizons2022 do you have the same issue if you upgrade to the latest version?

fabmeyer commented 1 year ago

@gventuri I am also working on a Streamlit app. If I want to plot a graph, it shows the graph correctly next to the input. However, I want to access the graph object (I guess it is a matplotlib.pyplot.figure?) and use it elsewhere, but the call returns None.

answer['answer'] = st.session_state['pandas_ai'](st.session_state['companies_df'], prompt=user_input)
print('answer: ', answer['answer'])
print('type: ', type(answer['answer']))

returns:

answer:  None
type:  <class 'NoneType'>

I am on 0.7.2 and am using the Streamlit middleware.

gventuri commented 1 year ago

OK, I get the point. Unfortunately, there's no way to access the plot object as of now. What use case do you have in mind? I guess something like that could be done with middlewares.

fabmeyer commented 1 year ago

@gventuri I would like to place the plot somewhere else. At the moment the plot is shown below where the prompt was run.

headhuanglan commented 1 year ago

Same no-plot issue: it is not working in Streamlit with StreamlitMiddleware().

gventuri commented 1 year ago

@fabmeyer that makes sense. We will work on a way to display it at the specific location you want.

@headhuanglan could you be more specific? Can you share the code? It should work with StreamlitMiddleware(), just maybe it doesn't allow you to render where you prefer. Let me know!

fabmeyer commented 1 year ago

@gventuri also the plotting only seems to work (as described above) when running the Streamlit app locally. When deployed on Streamlit-Cloud it doesn't work anymore. Maybe you could look into that as well?

ihorizons2022 commented 1 year ago

Nope, it still does not work with the latest version.

I use the following code:

llm = OpenAI(model="gpt-4")
pandas_ai = PandasAI(llm, enable_cache=False, verbose=True, middlewares=[StreamlitMiddleware()])

RoyKulik commented 1 year ago

If you use plt.gcf(), note that the figure is being closed in _format_results:

  def _format_results(self, result: dict):
      if result is None:
          return

      if result["type"] == "dataframe":
          from ..smart_dataframe import SmartDataframe

          df = result["value"]
          if self.engine == "polars":
              if polars_imported:
                  import polars as pl

                  df = pl.from_pandas(df)

          return SmartDataframe(
              df,
              config=self._config.__dict__,
              logger=self._logger,
          )
      elif result["type"] == "plot":
          import matplotlib.pyplot as plt
          import matplotlib.image as mpimg

          # Load the image file
          image = mpimg.imread(result["value"])

          # Display the image
          plt.imshow(image)
          # plt.axis("off")
          # plt.show(block=self._is_running_in_console())
          # plt.close("all")
      else:
          return result["value"]

When I commented out those three lines (as shown above), the plot appears using this code:

res = st.session_state.smart_df.chat(question)
print(type(res))

fig = plt.gcf()
if fig.get_axes():
    st.pyplot(fig, use_container_width=False)

This is not the way to solve it, but maybe a step towards a solution.
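
The effect described above can be reproduced with matplotlib alone. A minimal sketch, assuming matplotlib is installed: once plt.close("all") has run, plt.gcf() no longer returns the plotted figure. It creates a new, empty one, so a check such as fig.get_axes() finds nothing to display.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

# Draw something on the current figure
fig, ax = plt.subplots()
ax.plot([0, 1], [0, 1])
axes_before = len(plt.gcf().get_axes())

# Same call as the last of the three commented-out lines
plt.close("all")

# gcf() now creates a fresh figure with no axes
axes_after = len(plt.gcf().get_axes())
plt.close("all")
```

This is why the if fig.get_axes(): guard in the Streamlit code is never satisfied while the close call is active.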

emonzies commented 1 year ago

I was struggling with pandasai 1.+ to plot with Streamlit... and the above bypass by @RoyKulik solved it!

fabmeyer commented 11 months ago

(Quoting @RoyKulik's comment above: the _format_results snippet and the plt.gcf() workaround.)

I cannot get this to work (with Streamlit).

If I use your code snippet, I get the following error:

type:  <class 'NoneType'>
2023-11-06 15:23:38.206 Uncaught app exception
Traceback (most recent call last):
  File "/home/fabmeyer/.local/share/virtualenvs/personal-assistant-Y3o7tQcK/lib/python3.9/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 552, in _run_script
    exec(code, module.__dict__)
  File "/home/fabmeyer/Dev/Python/personal-assistant/personal_assistant.py", line 230, in <module>
    image = mpimg.imread(st.session_state['generated_text'][i]['answer'])
  File "/home/fabmeyer/.local/share/virtualenvs/personal-assistant-Y3o7tQcK/lib/python3.9/site-packages/matplotlib/image.py", line 1525, in imread
    with img_open(fname) as image:
  File "/home/fabmeyer/.local/share/virtualenvs/personal-assistant-Y3o7tQcK/lib/python3.9/site-packages/PIL/ImageFile.py", line 117, in __init__
    self._open()
  File "/home/fabmeyer/.local/share/virtualenvs/personal-assistant-Y3o7tQcK/lib/python3.9/site-packages/PIL/PngImagePlugin.py", line 715, in _open
    if not _accept(self.fp.read(8)):
AttributeError: 'NoneType' object has no attribute 'read'

The type of the output is always None... however, the plot is in the directory of my IDE when running locally...
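
Since the call returns None while a plot file still ends up on disk, one possible stopgap is to look for the most recently written PNG instead of passing the None result to mpimg.imread. This is only a sketch under assumptions: the thread does not say where or under what name PandasAI saves the file, so the directory is a parameter, and newest_png is a hypothetical helper. The demo uses a temporary directory with two empty placeholder files.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def newest_png(directory: str) -> Optional[Path]:
    """Return the most recently modified .png in directory, or None if there is none."""
    pngs = list(Path(directory).glob("*.png"))
    if not pngs:
        return None
    return max(pngs, key=lambda p: p.stat().st_mtime)


# Demo with placeholder files (not real plots)
with tempfile.TemporaryDirectory() as tmp:
    empty_result = newest_png(tmp)
    old = Path(tmp, "old.png")
    new = Path(tmp, "new.png")
    old.write_bytes(b"")
    new.write_bytes(b"")
    os.utime(old, (1, 1))  # force distinct modification times
    os.utime(new, (2, 2))
    found_name = newest_png(tmp).name
```

In the app, the returned path would be checked for None before being given to mpimg.imread or st.image, which avoids the AttributeError in the traceback above.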