joaomdmoura / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License

Agentic cycle execution #397

Open gdagitrep opened 3 months ago

gdagitrep commented 3 months ago

Want to run the crew with a Critic agent in a loop: basically create two different agents, one prompted to generate good outputs and the other prompted to give constructive criticism of the first agent's output. I can't figure out how to make this work. Any pointers?

xitex commented 3 months ago

The second task must include a `context` that is the output of the first task.
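
A minimal sketch of that wiring (the agents and task text here are placeholders, and exact constructor arguments may vary between crewAI versions):

```python
from crewai import Agent, Task, Crew

# Placeholder agents for illustration
producer = Agent(role="Producer", goal="Generate a good first draft",
                 backstory="An expert writer.", allow_delegation=False)
critic = Agent(role="Critic", goal="Give constructive criticism of a draft",
               backstory="A meticulous reviewer.", allow_delegation=False)

produce = Task(description="Write a short draft on the given topic.",
               expected_output="A draft.", agent=producer)
critique = Task(description="Critique the draft you receive as context.",
                expected_output="A list of concrete improvements.",
                agent=critic,
                context=[produce])  # the critic task receives the producer task's output

crew = Crew(agents=[producer, critic], tasks=[produce, critique])
print(crew.kickoff())
```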

gdagitrep commented 3 months ago

That part I got, but who takes the output of the second task? It seems like it needs to be a new, third task. Also, we want to break out of this loop conditionally, so we would need to create this linking on the fly, which seems not to be possible since the Crew is created only once, unless branching is possible within a Crew. E.g.:

first_producing_task = Task(...)
first_critic_task = Task(..., context=[first_producing_task])
second_producing_task = Task(..., context=[first_critic_task])

I want to do this in a loop, until the critic agent agrees.

Tavernari commented 3 months ago

@gdagitrep, if I understand correctly, you are asking how to add a kind of quality assurance (QA) step that decides whether the task output should pass, and possibly restarts the process.

xitex commented 3 months ago

Aha, got it. I think yes, it can be achieved with a third task attached to the first agent. Try with `memory=True`; perhaps this would obviate the need to pass the output of the first task and the context of the second into the last task (not sure).

gdagitrep commented 3 months ago

@Tavernari Sort of, yes, taking inspiration from https://arxiv.org/abs/2303.17651 (Self-Refine).

gdagitrep commented 3 months ago

@xitex What about the 4th task, which needs to review the work done by the third?

Tavernari commented 3 months ago

Could you provide a code snippet example so I can understand how a 3rd or 4th task would help? I am really curious to understand it.

gdagitrep commented 3 months ago

So the idea is to give the LLM an opportunity to iterate on its work, rather than a single shot to produce a result. It is loosely derived from RLHF, but instead of human feedback, we use another agent to provide the feedback. Take this example:

- Math_agent: solves math problems and produces an output. If feedback is provided, it uses it to improve the answer.
- Reviewer_agent: reviews the output of the math problems and provides feedback if there are mistakes. If there are no mistakes, it outputs SUCCESS.

Now create tasks for these two agents: give the first one a math problem, and give the second task both the problem and the output produced by the Math agent. Run them in a loop, breaking out only when the Reviewer_agent produces SUCCESS.
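
A rough sketch of that loop, driving the iterations outside the Crew since the task graph appears fixed once the Crew is created (agent definitions and argument names are illustrative and may vary by crewAI version):

```python
from crewai import Agent, Task, Crew

math_agent = Agent(role="Math agent", goal="Solve math problems correctly",
                   backstory="A careful mathematician.", allow_delegation=False)
reviewer_agent = Agent(role="Reviewer agent",
                       goal="Review solutions and answer SUCCESS only when they are correct",
                       backstory="A strict grader.", allow_delegation=False)

problem = "What is the integral of x**2 from 0 to 3?"
feedback = ""

for attempt in range(5):  # cap the iterations so the loop always terminates
    solve = Task(
        description=f"Solve: {problem}\nPrevious feedback (may be empty): {feedback}",
        expected_output="A worked solution.",
        agent=math_agent,
    )
    review = Task(
        description=f"Review the solution to: {problem}. "
                    "If it is correct, answer exactly SUCCESS; otherwise list the mistakes.",
        expected_output="SUCCESS, or a list of mistakes.",
        agent=reviewer_agent,
        context=[solve],  # the reviewer sees the solver's output
    )
    # Rebuild the crew every iteration, since its task list is fixed at creation time
    crew = Crew(agents=[math_agent, reviewer_agent], tasks=[solve, review])
    feedback = str(crew.kickoff())
    if "SUCCESS" in feedback:
        break
```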

xitex commented 3 months ago

I think in this case you should use Process.hierarchical, so you can create a manager agent who assigns the task to the relevant agent. The consensual process is still only in the planning stage, so for now you can try the hierarchical one. You can find examples at https://github.com/joaomdmoura/crewAI-examples
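
A minimal sketch of the hierarchical setup (assuming a crewAI version where `Process.hierarchical` takes a `manager_llm` and that `langchain_openai` is installed; the agents and task are placeholders):

```python
from crewai import Agent, Task, Crew, Process
from langchain_openai import ChatOpenAI

solver = Agent(role="Solver", goal="Produce the requested content",
               backstory="A capable generalist.", allow_delegation=False)
reviewer = Agent(role="Reviewer", goal="Check the content and ask for fixes if needed",
                 backstory="A strict QA reviewer.", allow_delegation=False)

# In the hierarchical process the manager decides which agent works on the task
work = Task(description="Write a short, correct summary of the given topic.",
            expected_output="A reviewed, correct summary.")

crew = Crew(
    agents=[solver, reviewer],
    tasks=[work],
    process=Process.hierarchical,           # an auto-created manager delegates and evaluates
    manager_llm=ChatOpenAI(model="gpt-4"),  # model choice is just an example
)
print(crew.kickoff())
```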

Tavernari commented 3 months ago

Related to this manager, my issue is with this scenario.

I have these agents: manager, writer, and translator.

Tasks:

1. Get content from the source
   - agent: manager
2. Write a text based on the source
   - context: 1
   - output_file: some path
   - agent: writer
3. Translate the content to some language
   - context: 2
   - output_file: some path
   - agent: translator
4. Review the final content
   - context: 1, 2, 3
   - output_file: some path
   - agent: manager
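
For reference, the wiring is roughly this (agents, descriptions, and output paths are placeholders; argument names follow what I believe is the current Task signature):

```python
from crewai import Agent, Task, Crew

manager = Agent(role="Manager", goal="Coordinate the work and review the final content",
                backstory="Editor in chief.")
writer = Agent(role="Writer", goal="Write articles from source material",
               backstory="Staff writer.")
translator = Agent(role="Translator", goal="Translate articles",
                   backstory="Professional translator.")

get_content = Task(description="Get content from the source.",
                   expected_output="The raw source content.", agent=manager)
write_text = Task(description="Write a text based on the source.",
                  expected_output="A draft article.", agent=writer,
                  context=[get_content], output_file="draft.md")
translate = Task(description="Translate the content to the target language.",
                 expected_output="A translated article.", agent=translator,
                 context=[write_text], output_file="translated.md")
review = Task(description="Review the final content.",
              expected_output="The approved final text, or a list of problems.",
              agent=manager,
              context=[get_content, write_text, translate], output_file="review.md")

crew = Crew(agents=[manager, writer, translator],
            tasks=[get_content, write_text, translate, review])
```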

How can I make task 4 force a re-execution of task 2 if the content is wrong or missing an important detail?

xitex commented 3 months ago

https://docs.crewai.com/how-to/Hierarchical/#hierarchical-process-overview In the example here you can see that the manager assigns agents to tasks, so you don't need to assign them yourself (I don't see your code, so I have to assume). To evaluate the result, the manager should know which important details it must check, so try including those in the task, or try generating some plan first (a planning task).

Tavernari commented 2 months ago

Thanks for the reply. Even with GPT-4, the manager was not reliable.

For example, in one of the tasks to generate some content, I put into the expected output an instruction to restart the task if the result doesn't match the requirements. It did not respect this rule: the manager noted that the content was wrong, yet accepted it instead of restarting the task.

Am I doing something wrong? Should the validation live in another task?

jp-gorman commented 2 months ago

There is little documentation on the functionality of the manager. Offloading planning to the user would be OK if you incorporated some way of training the manager LLM in concepts like https://github.com/allenai/lumos so it knows how to break down planning (is this somehow customizable by the user with their own trained planning/action combinations?). It would be great to have some integration with a training process for planning/actions if the developer is responsible for training the manager LLM. Training of the manager would be critical with respect to the tools available. Perhaps in-context learning (ICL) for the manager, as mentioned by someone before, could implicitly tune the manager LLM if we had access to it to disclose the tools we want available to agents, etc.

Based on the docs, it would appear the hierarchy is very flat, given the manager evaluates all returned outputs (presumably against each task's `expected_output`), so no real hierarchy is feasible (my understanding is the manager is at the top and all other agents are one level below). This is unlike, say, AutoGen's state transition graphs, which can be scripted to whatever state transitions you want, however deep you need, and with iteration if allowed. As noted in another blog I read, if we are letting the LLM do the planning and allocation, perhaps we should also let the LLM do the agent creation (as AutoGen does: https://microsoft.github.io/autogen/docs/reference/agentchat/contrib/agent_builder/).

Perhaps there are features I am missing that are not documented yet?

micuentadecasa commented 2 months ago

> There is little documentation on the functionality of the manager. [...] Perhaps there are features I am missing that are not documented yet?

I have the same impression