langchain-ai / langchain

🦜🔗 Build context-aware reasoning applications
https://python.langchain.com
MIT License

NameError: Name pd not found with PythonRepl #6166

Closed NeyoxDrago closed 6 months ago

NeyoxDrago commented 1 year ago

System Info

langchain version = 0.0.198

While using the langchain create_pandas_dataframe_agent, it was able to generate the correct intermediate command, but when it came to executing it, it said pd is not defined. It's not able to detect that pandas is to be imported as pd.

I am using the AzureOpenAI service with the gpt-3.5-turbo model.

Can anyone help me with this?

Who can help?

No response

Information

Related Components

Reproduction

Same code run

Expected behavior

in intermediate step, it will say pd is not defined.

vowelparrot commented 1 year ago

A code snippet would help

NeyoxDrago commented 1 year ago

The code snippet is the same as the one given in the create_pandas_dataframe_agent demo.

check this output snapshot, maybe this helps.

error

This is what I get on running the command, and these errors keep coming as the model tries again and again...

vowelparrot commented 1 year ago

So you're running exactly this notebook https://python.langchain.com/en/latest/modules/agents/toolkits/examples/pandas.html

except the LLM has been changed to use an AzureOpenAI wrapper? It looks like the format of your output is quite different from what would occur in the example notebook. It has markdown syntax, etc.

NeyoxDrago commented 1 year ago

Yeah, I am running the same code, I just changed the model.

About the markdown: that might be because the output I posted above was run in CMD, but I'm not sure about that.

dosubot[bot] commented 1 year ago

Hi, @NeyoxDrago! I'm Dosu, and I'm here to help the LangChain team manage their backlog. I wanted to let you know that we are marking this issue as stale.

Based on my understanding of the issue, you are experiencing a NameError when trying to execute a command using the langchain create_pandas_dataframe_agent function. It seems that the code is not recognizing that pandas should be imported as pd. In the comments, vowelparrot asked for a code snippet to help troubleshoot, and you provided a snapshot of the error message. Vowelparrot then suggested that the output format might be different from the example notebook, and you confirmed that you are running the same code but with a different model.

Before we proceed, we would like to confirm if this issue is still relevant to the latest version of the LangChain repository. If it is, please let us know by commenting on this issue. Otherwise, feel free to close the issue yourself, or the issue will be automatically closed in 7 days.

Thank you for your understanding, and we look forward to hearing from you soon!

Best regards, Dosu

Liunech commented 1 year ago

I have the same issue, and I can reproduce it without any API. LangChain version is 0.0.230. The simplest way for me to reproduce it is by running

from langchain.tools.python.tool import PythonAstREPLTool
tool = PythonAstREPLTool()
tool.run("import pandas as pd\ndata = {'timeframe':30, 'number_transactions':20}\ndef main():\n  df = pd.DataFrame(data,index=[0])\n  return df\nmain()")

I get the error "NameError: name 'pd' is not defined".

If I run it this way, it works:

tool.run("import pandas as pd\ndata = {'timeframe':30, 'number_transactions':20}\ndf = pd.DataFrame(data,index=[0])\ndf\n")

I get the same kind of error whenever the code defines a function and calls it. The error will be "NameError: name 'my_function_name' is not defined".

I would appreciate help with this issue.

dosubot[bot] commented 1 year ago

@vowelparrot Could you please help @Liunech with this issue? They are experiencing a NameError when trying to execute a command using the langchain create_pandas_dataframe_agent function. They provided a code snippet and mentioned that they can reproduce the issue without any API. They are using LangChain version 0.0.230. Thank you!

netoront commented 11 months ago

I'm having a similar problem using langchain_experimental 0.0.42. The core problem seems to be this exec behavior (from https://docs.python.org/3/library/functions.html#exec):

If exec gets two separate objects as globals and locals, the code will be executed as if it were embedded in a class definition.

Using PythonAstREPLTool to run

d = "0123456789"
c = [d.count(digit) for digit in "02468"]
c

is the same as executing

class _:
    d = "0123456789"
    c = [d.count(digit) for digit in "02468"]

and then evaluating _.c. But because of Python's scoping rules while executing class definitions, the identifier d in the list comprehension doesn't refer to the variable d assigned in the first statement. (If I paste that into my IDE, I get a squiggly line under the d.)
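The same mechanism would also explain the function-based reproduction reported earlier in this thread: a function body resolves free names through globals, while an import executed with a separate locals dict is bound in locals. Below is a minimal standard-library sketch of that, assuming the tool passes two distinct dicts to exec. It uses math in place of pandas, and the helper name run is hypothetical, not part of LangChain.

```python
# Stand-in for agent-generated code; math replaces pandas so that the
# sketch needs only the standard library.
SOURCE = """
import math as m

def main():
    return m.sqrt(16)

result = main()
"""


def run(source, globals_, locals_):
    """Execute source and return `result`, or a message for a NameError."""
    try:
        exec(source, globals_, locals_)
    except NameError as exc:
        return f"NameError: {exc}"
    return locals_["result"]


# Two distinct dicts: `m` is bound in locals, but main() looks it up in globals.
separate = run(SOURCE, {}, {})

# One dict used for both: module-like behavior, so main() can see `m`.
namespace = {}
shared = run(SOURCE, namespace, namespace)

print(separate)  # NameError: name 'm' is not defined
print(shared)    # 4.0
```

With separate dicts the call fails exactly like the pd case; with a single shared dict it succeeds.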

IMO the right fix is to ensure that tool.globals and tool.locals are the same object. We generally want module-like evaluation anyway, not class-like, and in a module top level, globals() is locals() is true. Unfortunately, PythonAstREPLTool is a Pydantic model that validates its fields, which copies them. That is, you can't solve it by passing the same object for both because Pydantic gets in the way:

globs = {}
tool = PythonAstREPLTool(globals=globs, locals=globs)
print(id(globs), id(tool.globals), id(tool.locals))  # These are all different!

But you can hack it in afterward. Here's a complete program in which PythonAstREPLTool behaves as intended:

from langchain_experimental.tools.python.tool import PythonAstREPLTool

query = """
d = "0123456789"
c = [d.count(digit) for digit in "02468"]
c
"""

tool = PythonAstREPLTool()
tool.locals = tool.globals  # This line is required!
print(tool.run(query))
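For illustration only, here is a standard-library sketch of the "module-like" evaluation described above: execute every statement but the last, then evaluate the final expression, using one dict as both globals and locals. The function run_snippet is a hypothetical name, and this is not PythonAstREPLTool's actual implementation.

```python
import ast


def run_snippet(source, namespace):
    """Run source in one shared namespace; return the last expression's value."""
    tree = ast.parse(source)
    if not tree.body:
        return None
    *head, last = tree.body

    # Execute everything except the final statement.
    head_code = compile(ast.Module(body=head, type_ignores=[]), "<snippet>", "exec")
    exec(head_code, namespace, namespace)

    if isinstance(last, ast.Expr):
        # The final statement is an expression: evaluate it and return its value.
        last_code = compile(ast.Expression(body=last.value), "<snippet>", "eval")
        return eval(last_code, namespace, namespace)

    # Otherwise it is an ordinary statement; run it and return nothing.
    last_code = compile(ast.Module(body=[last], type_ignores=[]), "<snippet>", "exec")
    exec(last_code, namespace, namespace)
    return None


query = '''
d = "0123456789"
c = [d.count(digit) for digit in "02468"]
c
'''

counts = run_snippet(query, {})
print(counts)  # [1, 1, 1, 1, 1]
```

Because globals and locals are the same object, names assigned at the top level are visible inside comprehensions and function bodies, just as in a module.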

Now to work out how to hack this into the Pandas dataframe agent from the outside...

netoront commented 11 months ago

To be clear, I don't mean that this is just funny behavior that devs need to work around. This is absolutely a bug in PythonAstREPLTool.

netoront commented 11 months ago

This terrible hack seems to work around the bug when PythonAstREPLTool is used by the Pandas dataframe agent:

PythonAstREPLTool_init = PythonAstREPLTool.__init__

def PythonAstREPLTool_init_wrapper(self, *args, **kwargs):
    PythonAstREPLTool_init(self, *args, **kwargs)
    self.globals = self.locals

PythonAstREPLTool.__init__ = PythonAstREPLTool_init_wrapper

RichardHruby commented 10 months ago

I am facing the same issue. @netoront could you please detail how you solved it? Are you passing this as an extra tool to the Pandas dataframe agent, or are you modifying the pandas_agent.tools[0] directly?

netoront commented 10 months ago

I am facing the same issue. @netoront could you please detail how you solved it? Are you passing this as an extra tool to the Pandas dataframe agent, or are you modifying the pandas_agent.tools[0] directly?

I'm modifying PythonAstREPLTool.__init__, as in my previous comment. My program runs that code on startup.
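To make the ordering concrete, here is a small sketch of the same wrapper pattern applied to a stand-in class. StandInTool is hypothetical and only mimics the reported symptom (distinct globals and locals dicts); it is not the real PythonAstREPLTool. The point is that only instances created after the patch is installed are affected, which is why the patch has to run at startup, before the agent constructs its tool.

```python
class StandInTool:
    """Hypothetical stand-in for PythonAstREPLTool; not the real class."""

    def __init__(self, locals=None):
        # Mimic the reported behavior: globals and locals are distinct dicts.
        self.globals = {}
        self.locals = dict(locals or {})


original_init = StandInTool.__init__


def init_wrapper(self, *args, **kwargs):
    original_init(self, *args, **kwargs)
    # Point globals at locals so anything the caller supplied stays visible.
    self.globals = self.locals


# Created before the patch: still has two separate namespaces.
before = StandInTool(locals={"df": "frame"})

StandInTool.__init__ = init_wrapper

# Created after the patch: one shared namespace.
after = StandInTool(locals={"df": "frame"})

print(before.globals is before.locals)  # False
print(after.globals is after.locals)    # True
print("df" in after.globals)            # True
```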