Closed sundar7D0 closed 1 year ago
+1
+1
I think this is a general problem with `gpt-3.5-turbo`; if you try `gpt-4` instead, it will work. `gpt-3.5-turbo` adds backticks to the action input, which causes the syntax error. I'm trying to fix this behavior but haven't managed to yet.
Based on my understanding, this affects more than just `create_dataframe_agent` and could affect any agent or tool that uses `PythonREPL` without "sanitizing" its input. Sanitization seems to be left up to each caller of `PythonREPL`, and the "sanitization" logic is duplicated in at least a few places.
- `LLMMathChain` does sanitize/strip backticks from the input, so in theory it should work fine with `gpt-3.5-turbo`/`ChatOpenAI`.
- `PythonAstREPLTool` does not appear to sanitize the input. This is what `create_pandas_dataframe_agent` uses, and it is likely the source of the issue here.
- `PythonREPLTool`, which is defined right above `PythonAstREPLTool`, does sanitize/strip out the triple backticks.

(Note: the main motivation for wanting to use `gpt-3.5-turbo` over `text-davinci-003` is the much lower price at this time.)
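Since the sanitization described above is duplicated across callers, one option is a single shared helper that every tool calls before handing text to the REPL. The following is only a minimal sketch under that assumption: `sanitize_tool_input` is a hypothetical name, not part of LangChain, and it handles only a surrounding code fence or stray surrounding backticks.

```python
import re

# Hypothetical helper (not a LangChain API): matches text that is entirely
# wrapped in a triple-backtick fence, with an optional language tag that
# must be followed by a newline (e.g. "python\n").
_FENCE_RE = re.compile(r"^```(?:[A-Za-z0-9_+-]*[ \t]*\n)?(.*?)\s*```$", re.DOTALL)


def sanitize_tool_input(text: str) -> str:
    """Return the code inside a surrounding fence or inline backticks."""
    stripped = text.strip()
    match = _FENCE_RE.match(stripped)
    if match:
        # Fully fenced input: keep only the body of the fence.
        return match.group(1).strip()
    # Otherwise drop stray leading/trailing backticks, e.g. `df.head()`.
    return stripped.strip("`").strip()


if __name__ == "__main__":
    print(sanitize_tool_input("```python\ndf.shape\n```"))
    print(sanitize_tool_input("`df.head()`"))
```

Each tool could call such a helper once on its input instead of carrying its own copy of the stripping logic.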
However, even `LLMMathChain` sometimes has issues with sanitizing. Here is one run I've seen fail, though strangely it doesn't always fail (using `ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo')`; note the answer would have been wrong anyway):
agent.run("Calculate the population of the USA if the entire country had a population density equivalent to Wyoming")
> Entering new AgentExecutor chain...
I should use a calculator for this question
Action: Calculator
Action Input: Wyoming population (578,759) and USA land area (3,796,742 square miles)
Observation: Answer: 7.621784677494547
Thought:That's the population density of Wyoming, not the total population of the USA. I should use a search engine to find the total population of Wyoming.
Action: Search
Action Input: "Wyoming population"
Observation: Wyoming is a state in the Mountain West subregion of the Western United States. It is bordered by Montana to the north and northwest, South Dakota and Nebraska to the east, Idaho to the west, Utah to the southwest, and Colorado to the south.
Thought:Oops, I need to refine my search query to find the population
Action: Search
Action Input: "population of wyoming"
Observation: Wyoming is a state in the Mountain West subregion of the Western United States. It is bordered by Montana to the north and northwest, South Dakota and Nebraska to the east, Idaho to the west, Utah to the southwest, and Colorado to the south.
Thought:That didn't work. I should try a different search query.
Action: Search
Action Input: "current population of wyoming"
Observation: Wyoming is a state in the Mountain West subregion of the Western United States. It is bordered by Montana to the north and northwest, South Dakota and Nebraska to the east, Idaho to the west, Utah to the southwest, and Colorado to the south.
Thought:This is strange, the search engine isn't giving me the information I need. Let me try a different one.
Action: Search
Action Input: "population of wyoming 2021"
Observation: Wyoming is a state in the Mountain West subregion of the Western United States. It is bordered by Montana to the north and northwest, South Dakota and Nebraska to the east, Idaho to the west, Utah to the southwest, and Colorado to the south.
Thought:That still didn't work. Maybe I should try a different source.
Action: Search
Action Input: "wyoming population census bureau"
Observation: Population (up 7.4% to 331.4 million). Race and ethnicity (White alone 61.6%; Black alone 12.4%; Hispanic 18.7%; Asian alone 6%; American Indian ...
Thought:Ok, now I have the population of Wyoming and the population density equivalent to Wyoming. I can use the calculator to find the total US population.
Action: Calculator
Action Input: Wyoming population (578,759) and population density equivalent (5.840 people per square mile) and USA land area (3,796,742 square miles)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-4-8b759719f91c> in <cell line: 4>()
2 # Also sometimes outputs the WRONG answer; one time it just gave the incorrect answer of the USA population!
3 # Correct answer should be: ~22,212,450
----> 4 agent.run("Calculate the population of the USA if the entire country had a population density equivalent to Wyoming")
12 frames
/usr/local/lib/python3.9/dist-packages/langchain/chains/llm_math/base.py in _process_llm_result(self, t)
66 answer = "Answer: " + t.split("Answer:")[-1]
67 else:
---> 68 raise ValueError(f"unknown format from LLM: {t}")
69 return {self.output_key: answer}
70
ValueError: unknown format from LLM: What is Wyoming's land area in square miles?
```python
wyoming_pop_density = 5.84
wyoming_pop = 578759
wyoming_area = wyoming_pop / wyoming_pop_density
print(wyoming_area)
```
(Note: I had to indent the last few lines by 4 spaces to get the backticks to render properly; there was no indentation in the actual output.)
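In the failing run above, the model's reply starts with a sentence and only then opens a Python code fence. A parser that expects the reply to begin with the fence or with `Answer:` would reject it; that is an assumption about the check, inferred from the traceback. A more tolerant approach is to search for the first fenced block anywhere in the text. This is a rough sketch, and `extract_code` is a hypothetical helper, not LangChain code:

```python
import re
from typing import Optional

# Find an opening fence with an optional language tag, then capture
# everything up to the closing fence, or to the end of the text if the
# model never closed the fence.
_CODE_BLOCK_RE = re.compile(r"```[A-Za-z0-9_+-]*[ \t]*\n(.*?)(?:```|\Z)", re.DOTALL)


def extract_code(llm_text: str) -> Optional[str]:
    """Return the body of the first fenced block, or None if there is none."""
    match = _CODE_BLOCK_RE.search(llm_text)
    if match is None:
        return None
    return match.group(1).strip()


if __name__ == "__main__":
    reply = (
        "What is Wyoming's land area in square miles?\n"
        "```python\n"
        "wyoming_pop_density = 5.84\n"
        "wyoming_pop = 578759\n"
        "wyoming_area = wyoming_pop / wyoming_pop_density\n"
        "print(wyoming_area)\n"
    )
    print(extract_code(reply))
```

Returning `None` rather than raising lets the caller decide whether to fall back to looking for `Answer:` or to retry the prompt.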
@sundar7D0, did this issue get resolved? Does the chat model work with the pandas agent now?
I think that, with some of the fixes suggested here and in other issues to resolve the parsing bottleneck, the chat model now works with the pandas agent!
> issues to resolve the parsing bottleneck
Hey @sundar7D0! Any chance you can point out the other issues? 🙏
Please check the #1358 thread.
I am still experiencing this on v0.0.163. I don't have access to gpt4, unfortunately. I don't see a manual workaround here in the thread. Is the fix supposed to be in the repo already?
I solved this problem by using `AgentType.OPENAI_FUNCTIONS`:
create_pandas_dataframe_agent(ChatOpenAI(temperature=0, model_name='gpt-3.5-turbo'), df, verbose=True, agent_type=AgentType.OPENAI_FUNCTIONS)
I tried creating a pandas dataframe agent (using `create_dataframe_agent`) with `ChatOpenAI` as the LLM, both the PromptLayer version and the plain one, but LangChain isn't able to parse the code in the LLM's output. If I modify some of the regular expressions manually, it works (but it fails again if the code is a single line, etc.). Basically, the regex and parsing logic for extracting `Action` and `Action Input` from chat model output needs to be changed. This probably means a lot of other existing chains will also break with `ChatOpenAI` as the LLM.
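As a rough illustration of the kind of parsing change described above (this is not the author's code or LangChain's parser), the sketch below extracts `Action` and `Action Input` from a reply and accepts both single-line inputs wrapped in backticks and multi-line fenced inputs. The function name `parse_action` and the regular expressions are assumptions made for the example.

```python
import re
from typing import Optional, Tuple

# Tool name on the "Action:" line, then everything after "Action Input:".
_ACTION_RE = re.compile(
    r"Action\s*:\s*(?P<tool>[^\n]+?)\s*\n+\s*Action[ _]Input\s*:\s*(?P<input>.*)",
    re.DOTALL,
)
# Input that is entirely wrapped in a triple-backtick fence.
_FENCE_RE = re.compile(r"^```(?:[A-Za-z0-9_+-]*[ \t]*\n)?(.*?)\s*```$", re.DOTALL)


def parse_action(llm_text: str) -> Optional[Tuple[str, str]]:
    """Return (tool, tool_input), or None when no action is present."""
    match = _ACTION_RE.search(llm_text)
    if match is None:
        return None
    tool = match.group("tool").strip()
    # Drop anything the model wrote after a made-up "Observation:" line.
    tool_input = match.group("input").split("\nObservation:")[0].strip()
    fenced = _FENCE_RE.match(tool_input)
    if fenced:
        tool_input = fenced.group(1)
    # Also remove inline backticks around single-line code.
    return tool, tool_input.strip("`").strip()


if __name__ == "__main__":
    print(parse_action("Action: python_repl_ast\nAction Input: `len(df)`"))
```

A reply without an `Action:` line, such as a final answer, yields `None`, so the caller can handle that case separately.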
Code:
Output: