crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License

Multiple LLMs will not trigger multiple agent execution calls and infinitely loop #983

Open Nolanjue opened 1 month ago

Nolanjue commented 1 month ago

I'm not sure if this is stated in the docs, but currently, running one agent with a specific model that has a tool which calls another agent (with a different model) fails to do so and ends up in an infinite loop.

Also, are CrewAI's verbose thoughts and actions limited by the complexity of the model? Certain models like phi3 are not able to use tools at all (tested in LM Studio).

Example (I have left the text inside each string literal empty):

"agent using the function"

```python
from crewai import Agent, Crew, Task
# `Specific` (the tool class) is defined below; `ollama_llm` is shown further down

def new_task(agent):
    return Task(
        description=f"""
            """,
        agent=agent,
        expected_output=""". """,
    )

agent = Agent(
    role="",
    goal="""
         """,
    backstory="""
         """,
    tools=[Specific.choose_specific_action],
    llm=ollama_llm,
    verbose=True,
)

upper_management = Crew(agents=[agent], tasks=[new_task(agent)])

workflow = upper_management.kickoff()

print("final answer", workflow)
```

"agent inside the function(using openai LLM)"


```python
class Specific:
    @tool("choose agent to generate a response")
    def choose_specific_action(query: str):
        """ """
        specific_agent = Agent(
            role="info agent",
            goal=""" """,
            backstory="""
            """,
            allow_delegation=False,
            tools=[RAGsearch],
            # llm=stable_code_llm,
            verbose=True,
        )

        task = Task(
            agent=specific_agent,  # was `easy_agent`, which is not defined in this scope
            description=f""" """,
            expected_output="",
        )
        summary = task.execute()
        print("agent response", summary)
        return summary
```

Is there a reason for these issues? I'd be glad if anyone knows more.

CrewAI also seems to loop infinitely with Ollama across the board, when using this:

```python
ollama_llm = Ollama(model="llama2", base_url="http://localhost:11434")
```
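Before debugging the crew itself, it can help to confirm the model responds at all outside CrewAI. A minimal sanity check, assuming LangChain's community Ollama wrapper and a local `ollama serve` on the default port:

```python
from langchain_community.llms import Ollama  # import path assumed

# If this hangs or errors, the problem is the Ollama link, not CrewAI.
ollama_llm = Ollama(model="llama2", base_url="http://localhost:11434")
print(ollama_llm.invoke("Reply with the single word: pong"))
```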

Nolanjue commented 1 month ago

There is only one instance where this workflow appears to work, and that is with the OpenAI GPT LLMs. Any other model I've used seems to either not call tools or, worse, loop infinitely. I don't think it's a hardware issue (it could be), but that's all I've noticed so far.

lorenzejay commented 1 month ago

Can you provide some output traces?

Nolanjue commented 1 month ago

It basically runs the crew execution for the first agent, but when it attempts to use a tool (which I have provided for it to use), it just stalls and somehow reruns the step, again and again, without ever using the tools. For example, you can see how it just repeats the same thought:

```
> Entering new CrewAgentExecutor chain...
Thought: I understand that I need to classify the difficulty of the query "What is this this company " based on the existing tools provided. I will carefully evaluate each tool and choose the one that best matches the query's relevance to the company provided

Action: Use the "easy query" tool.
Action Input: { "query": "What is this company?" }

Action 'Use the "easy query" tool.' don't exist, these are the only available Actions:
 choose the easy difficulty(query: str) - Useful if the users query seems like an easy task to accomplish or question to answer related to the company,

Thought: As the HR manager, it is important to use the available tools to classify the difficulty of the query "What is this company?" based on its relevance to our company. I will carefully evaluate each tool and choose the one that best matches the query's relevance to this comapny.

Action: Use the "easy query" tool.
Action Input: { "query": "What is this company" }

Action 'Use the "easy query" tool.' don't exist, these are the only available Actions:
 choose the easy difficulty(query: str) - Useful if the users query seems like an easy task to accomplish or question to answer related to the company,
```
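Reading the trace, the model emits `Action: Use the "easy query" tool.` while the only registered action is named `choose the easy difficulty(query: str)`, so the executor rejects the step and re-prompts, which produces the loop. One possible workaround (the names below are illustrative, not from the original code) is to give the tool a short, exact name that a weaker model can reproduce verbatim:

```python
from crewai_tools import tool  # decorator location assumed; some setups import it from langchain

# A short, unambiguous tool name makes it easier for small local models to emit
# a matching `Action:` string instead of paraphrasing the tool's description.
@tool("easy_query")
def easy_query(query: str) -> str:
    """Answer an easy, company-related question."""
    return f"Handled easy query: {query}"
```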
goran-ristic-dev commented 1 month ago

@Nolanjue You are having several issues, if I understood properly.

  1. Calling multiple models: as I can see, you are running LLM models locally. Ollama has a parameter that defines how many models can run simultaneously: OLLAMA_MAX_LOADED_MODELS. Type `set` to see whether this parameter is defined and is limiting the number of models that can run at once; see the sketch after this list.
  2. Capability to call tools depends on the selected model.
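A quick way to inspect that variable from Python (a sketch; note it must be set in the environment of the `ollama serve` process, not of the script, to have any effect):

```python
import os

# Prints the configured limit, or a note if Ollama is using its default.
print(os.environ.get("OLLAMA_MAX_LOADED_MODELS", "not set (Ollama default applies)"))
```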

From my own experience, not all models have the same reasoning capabilities, and that impacts their ability to call different tools. I have used phi3 and llama3 with varying degrees of success and failure when calling tools.

What helps is being more instructive in the prompt (description) about using tools. For example:

```python
phi3_ollama = ChatOllama(model="phi3", format="json")

embedder = {"provider": "ollama", "config": {"model": "nomic-embed-text"}}

docs_scrape_tool = ScrapeWebsiteTool(
    website_url="https://docs.crewai.com/how-to/Creating-a-Crew-and-kick-it-off/",
)
scrape_tool = ScrapeWebsiteTool(embedder=embedder)

research_topic = Task(
    agent=PredefinedAIAgents.content_researcher,
    description="""Research the topic: {topic}, and collect data from internet. Process:
    1. You will first use your own experience about topic,
    2. Use DuckDuckGoSearch tool to find best sources which contain more info.
    3. Use ScrapeWebsiteTool to read data from url or urls. Don't forget to pass the parameters for tools as dictionary with parameter-value pairs.""",
    expected_output="Research report with really detailed information on the topic: {topic}",
    llm=phi3_ollama,
    tools=[CustomAgentTools.search_duck_duck_go, scrape_tool],
    verbose=2,
)
```

This prompt worked for me, sometimes on the first attempt and sometimes after several attempts. Agents have a self-correcting process and may eventually succeed (if the max number of iterations is not reached).
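On the iteration point: CrewAI's `Agent` exposes a `max_iter` cap that bounds the self-correction loop. A minimal sketch (the role, goal, and backstory here are placeholders):

```python
from crewai import Agent

# Once the cap is reached, the agent is forced to wrap up instead of
# re-prompting itself indefinitely.
bounded_agent = Agent(
    role="researcher",
    goal="...",
    backstory="...",
    max_iter=5,  # stop after 5 thought/tool cycles
    verbose=True,
)
```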

Nolanjue commented 1 month ago

> (quoting @goran-ristic-dev's reply above)

1. Very interesting feature; it might be more effective than other alternatives.

2. Yes, after researching this issue a bit more thoroughly, it seems certain open-source models, like those from LM Studio (very ineffective), might not be the best option: models like Stable Code, phi3, and llama2 don't seem to have an architecture complex enough to handle tool requests from any agentic RAG setup. I'm not sure exactly why, but there must be a reason.

llama3 should be able to do so, and I am able to run it. Testing llama3 with LlamaIndex agents gets me the issue here: https://github.com/ollama/ollama/issues/3774. Not sure if it's a hardware issue (I don't have GPU-offloading capability on my device), but it seems models like llama3 should be able to execute tools. Also, regarding the self-correction process, I'm thinking it's related to the infinite loop.

Kingbadger3d commented 1 month ago

Hi guys, I've been having the same issues. I'm running Ollama; 4-5 days ago everything was working for the most part. Then I did a git pull 3 or 4 days back, and now 95% of the time the GPU just sits at 100% usage and never returns the result.

(I'm using GPT-4o mini for the researcher agent, which then passes the writer task to a local Ollama model. This was working with Arcee models, Qwen2.5, InternLM2, and a bunch of other recently released models; now it almost never works, or it will do one document out of a list of subjects to research, and then on the second doc it runs, the GPU sits at 100% and never returns anything.)

I've also built my own versions of Llama 3.1 8B which work with Ollama in Open WebUI, but in CrewAI it does one doc at best and just infinitely loops after that. A fix would be nice, as I have a funding-round presentation in the middle of August, and I've gone from a working prototype to a broken prototype.

Nolanjue commented 1 month ago

> (quoting @Kingbadger3d's comment above)

You could possibly try ReActAgents from LlamaIndex to test those LLMs as a quick backup scenario? It will probably hit errors too, assuming CrewAI gets its main features from LlamaIndex, but it doesn't hurt to try.
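For reference, a minimal ReAct agent in LlamaIndex looks roughly like this (a sketch; the tool and model are illustrative, and the import paths assume a recent `llama-index` with the Ollama integration installed):

```python
from llama_index.core.agent import ReActAgent
from llama_index.core.tools import FunctionTool
from llama_index.llms.ollama import Ollama

def easy_query(query: str) -> str:
    """Answer an easy, company-related question."""
    return f"Handled easy query: {query}"

# Wrap the function as a tool and hand it to a ReAct-style agent.
llm = Ollama(model="llama3", base_url="http://localhost:11434")
agent = ReActAgent.from_tools(
    [FunctionTool.from_defaults(fn=easy_query)],
    llm=llm,
    verbose=True,
)
print(agent.chat("What is this company?"))
```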