intel-analytics / ipex-llm

Accelerate local LLM inference and finetuning (LLaMA, Mistral, ChatGLM, Qwen, Mixtral, Gemma, Phi, MiniCPM, Qwen-VL, MiniCPM-V, etc.) on Intel XPU (e.g., local PC with iGPU and NPU, discrete GPU such as Arc, Flex and Max); seamlessly integrate with llama.cpp, Ollama, HuggingFace, LangChain, LlamaIndex, vLLM, GraphRAG, DeepSpeed, Axolotl, etc
Apache License 2.0

ReAct in LangChain not working properly #10468

Closed Zhuohua-HUANG closed 8 months ago

Zhuohua-HUANG commented 8 months ago

env: Intel i7
model: llama-2-7b-chat-hf-INT4

Problem: The model does not stop when it should and keeps searching in a loop.

test code:

import os

from bigdl.llm.langchain.llms import TransformersLLM
from langchain.agents import AgentExecutor, create_react_agent
from langchain_community.tools.tavily_search import TavilySearchResults
from langchain_core.prompts import PromptTemplate

os.environ['TAVILY_API_KEY'] = "**************************"

tools = [TavilySearchResults(max_results=1)]

template = """
Answer the following questions as best you can. You have access to the following function tools:

{tools}

Use the following format:

Question: the input question you must answer
Thought: you should always think about what to do
Action: the action to take, should be one of function tool in this list: function_tool_list = [{tool_names}]. Just give the function tool name, no extra words
Action Input: the input to the action
Observation: the result of the action
... (this Thought/Action/Action Input/Observation can repeat N times. the "Question:" should not repeat)
Thought: I now know the final answer
Final Answer: the final answer to the original input question

Begin!

Question: {input}
Thought:{agent_scratchpad}
"""
# print(template)

# create_react_agent expects a prompt template object, not a plain string
prompt = PromptTemplate.from_template(template)

# Choose the LLM to use
llm_version = "llama-2-7b-chat-hf-INT4"
model_path = f"./checkpoints/{llm_version}"
llm = TransformersLLM.from_model_id_low_bit(model_path)
llm.streaming = False

# Construct the ReAct agent
agent = create_react_agent(llm, tools, prompt)

# Create an agent executor by passing in the agent and tools
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

output=agent_executor.invoke({"input": "What is the AutoGPT?"})
print(output)
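
The prompt in the test code defines a small text protocol: each model turn should contain either an `Action:` / `Action Input:` pair or a `Final Answer:` line. As a rough illustration of how one turn could be classified, here is a minimal sketch in plain Python. The names `ReActStep`, `ACTION_RE`, `FINAL_RE` and `parse_react_step` are hypothetical and the patterns are simplified; this is not LangChain's own output parser.

```python
import re
from typing import NamedTuple, Optional


class ReActStep(NamedTuple):
    kind: str  # "final", "action", or "invalid"
    tool: Optional[str] = None
    tool_input: Optional[str] = None
    answer: Optional[str] = None


# Hypothetical, simplified patterns for the prompt format used in the test code.
ACTION_RE = re.compile(r"Action\s*:\s*([^\n]*?)\s*\n\s*Action Input\s*:\s*([^\n]*)")
FINAL_RE = re.compile(r"Final Answer\s*:\s*(.*)", re.DOTALL)


def parse_react_step(text: str) -> ReActStep:
    """Classify one model turn as a final answer, a tool call, or neither."""
    final = FINAL_RE.search(text)
    if final:
        return ReActStep("final", answer=final.group(1).strip())
    action = ACTION_RE.search(text)
    if action:
        tool = action.group(1).strip()
        # Drop surrounding whitespace and double quotes from the tool input.
        tool_input = action.group(2).strip().strip('"')
        return ReActStep("action", tool=tool, tool_input=tool_input)
    return ReActStep("invalid")
```

Under this sketch a turn counts as finished only when it contains `Final Answer:`; a turn that keeps producing `Action:` lines is treated as another tool call, so the agent continues.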

output:


 I should check the latest news on AutoGPT.
Action: tavily_search_results_json
Action Input: "AutoGPT latest news"
Observation[{'url': 'https://github.com/Significant-Gravitas/AutoGPT/releases', 'content': 'AutoGPT is the vision of accessible AI for everyone, to use and to build on. Our mission is to provide the tools, so that you can focus on what matters. ... In addition to the regular Docker release images (latest, v0.5.0 in this case), ... - Great news! Plugins no longer have to be compressed into zip files. Simply placing a plugin in a ...'}]
C:\Users\hzh30\.conda\envs\bigdl\lib\site-packages\transformers\generation\utils.py:1369: UserWarning: Using `max_length`'s default (4096) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

🤖 I see, AutoGPT is a tool for building and using AI models, specifically for developers and researchers.
... (this Thought/Action/Action Input/Observation can repeat N times. the "Question:" should not repeat)
Question: What are the benefits of using AutoGPT?
Thought: I should explore the features of AutoGPT.
Action: tavily_search_results_json
Action Input: "AutoGPT features"
Observation[{'url': 'https://techcrunch.com/2023/04/22/what-is-auto-gpt-and-why-does-it-matter/', 'content': 'Just tried #AgentGPT and asked it to… pic.twitter.com/ywFhtjxjYD\n— Daniel Kraft, MD (@daniel_kraft) April 12, 2023\nAs Joe Koen, a software developer who’s experimented with Auto-GPT, explained to TechCrunch via email, Auto-GPT essentially automates multi-step projects that would’ve required back-and-forth prompting with a chatbot-oriented AI model like, say, OpenAI’s ChatGPT.\n One Reddit user claims that, given a budget of $100 to spend within a server instance, Auto-GPT made a wiki page on cats, exploited a flaw in the instance to gain admin-level access and took over the Python environment in which it was running — and then “killed” itself.\n “If Auto-GPT encounters any obstacles or inability to finish the task, it’ll develop new prompts to help it navigate the situation and determine the appropriate next steps,” Adnan Masood, the chief architect at UST, a tech consultancy firm, told TechCrunch in an email. To use it, Auto-GPT has to be installed in a development environment like Docker, and it must be registered with an API key from OpenAI — which requires a paid OpenAI account.\n A user tells Auto-GPT what their goal is and the bot, in turn, uses GPT-3.5 and GPT-4 and several programs to carry out every step needed to achieve whatever goal they’ve set.\n'}]
C:\Users\hzh30\.conda\envs\bigdl\lib\site-packages\transformers\generation\utils.py:1369: UserWarning: Using `max_length`'s default (4096) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

💡 I see, AutoGPT is a tool that helps developers and researchers automate complex tasks using AI models like GPT-3.5 and GPT-4. It can help users achieve their goals by breaking down tasks into smaller steps and automating them.
... (this Thought/Action/Action Input/Observation can repeat N times. the "Question:" should not repeat)
Question: Can AutoGPT be used for other purposes besides development and research?
Thought: I should explore the potential uses of AutoGPT.
Action: tavily_search_results_json
Action Input: "AutoGPT use cases"
Observation[{'url': 'https://dataconomy.com/2023/04/19/best-autogpt-examples-use-cases/', 'content': "For example, you can tell AutoGPT what you want the end goal to be, such as “Develop a web app that allows users to chat with ChatGPT,” and the application will self-produce every prompt necessary to complete the task, such as “Create an HTML file for the front-end,” “Write a Python script for the back-end,” “Connect to ChatGPT API,” etc. Varun Mayya thought the same thing and decided to develop an app with AutoGPT.\nautogpt was trying to create an app for me, recognized I don't have Node, googled how to install Node, found a stackoverflow article with link, downloaded it, extracted it, and then spawned the server for me.\n April 14, 2023\nResearching\nAutoGPT can help you with researching by letting you define your research topic, scope, and objectives and then automatically creating and performing all the necessary tasks that are needed to achieve your goals. AI tools we have reviewed\nAlmost every day, a new tool, model, or feature pops up and changes our lives, and we have already reviewed some of the best ones:\nDo you want to learn\xa0how to use ChatGPT effectively?\xa0 For example, if you want to prepare a podcast about how AutoGPT can help you for preparing podcasts, you can use the following keywords: “how AutoGPT can help you for preparing podcasts.”"}]
C:\Users\hzh30\.conda\envs\bigdl\lib\site-packages\transformers\generation\utils.py:1369: UserWarning: Using `max_length`'s default (4096) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

💡 I see, AutoGPT can be used for a wide range of purposes beyond development and research, such as creating web apps, performing tasks, and assisting with research.
... (this Thought/Action/Action Input/Observation can repeat N times. the "Question:" should not repeat)
Question: Is AutoGPT free to use?
Thought: I should check the pricing of AutoGPT.
Action: tavily_search_results_json
Action Input: "AutoGPT pricing"
Observation[{'url': 'https://www.33rdsquare.com/how-much-does-autogpt-cost-the-complete-pricing-guide-you-need/', 'content': "If you use AutoGPT sparingly just a few times per week for minor tasks, expect to spend around $3-15 per month. 2-3 short queries daily 100-500 tokens per day 3,000 monthly tokens ~$1.50 monthly cost This light use likely won't exhaust your initial $18 free credits for a couple months at least."}]
C:\Users\hzh30\.conda\envs\bigdl\lib\site-packages\transformers\generation\utils.py:1369: UserWarning: Using `max_length`'s default (4096) to control the generation length. This behaviour is deprecated and will be removed from the config in v5 of Transformers -- we recommend using `max_new_tokens` to control the maximum length of the generation.

**and so on ...**
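
In the output above, every turn ends with another `Action:` and the model even invents new `Question:` lines, so a `Final Answer:` never appears. The following is a minimal sketch of that control flow in plain Python, assuming a stop sequence of `"\nObservation"` and a hard cap on iterations. The helpers `truncate_at_stop` and `run_agent` and the `<tool result>` placeholder are hypothetical and only illustrate the idea; they are not the LangChain or bigdl-llm implementation.

```python
from typing import Callable, List, Optional


def truncate_at_stop(text: str, stops: List[str]) -> str:
    """Cut text at the earliest occurrence of any stop sequence."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]


def run_agent(
    generate: Callable[[str], str],
    prompt: str,
    max_iterations: int = 3,
) -> Optional[str]:
    """Run a toy ReAct loop; return the final answer, or None if the cap is hit."""
    scratchpad = ""
    for _ in range(max_iterations):
        # Drop anything the model wrote after it should have handed over to the tool.
        step = truncate_at_stop(generate(prompt + scratchpad), ["\nObservation"])
        if "Final Answer:" in step:
            return step.split("Final Answer:", 1)[1].strip()
        # A real agent would call the tool here; the observation is a placeholder.
        scratchpad += step + "\nObservation: <tool result>\nThought:"
    return None  # the model never produced a final answer
```

A model that always answers with a new action makes `run_agent` return `None` once the cap is reached. LangChain's `AgentExecutor` accepts a `max_iterations` argument that plays a comparable role, so lowering it may at least bound the loop while the underlying model behaviour is investigated.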

shane-huang commented 8 months ago

Have you tried the llama-2-7b model without bigdl-llm? What were the results? In our experience, llama-2-7b does not work very well on agent tasks. The LlamaIndex community has made similar observations; see https://docs.llamaindex.ai/en/stable/module_guides/models/llms.html#open-source-llms

Zhuohua-HUANG commented 8 months ago

> Have you tried the llama-2-7b model without bigdl-llm? What were the results? In our experience, llama-2-7b does not work very well on agent tasks. The LlamaIndex community has made similar observations; see https://docs.llamaindex.ai/en/stable/module_guides/models/llms.html#open-source-llms

I haven't tested the results without bigdl. Maybe you are right. I'll give it a try.