crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License

yaml config does not resolve context properly #527

Closed antonkulaga closed 3 months ago

antonkulaga commented 6 months ago

I see that you are trying to make YAML configuration a default now; unfortunately, right now, it is extremely buggy. For example, referencing another task by name in context does not work. I get:

context.0
  Input should be a valid dictionary or instance of Task [type=model_type, input_value='research_topic_task', input_type=str]
    For further information visit https://errors.pydantic.dev/2.7/v/model_type

The yaml I wrote was:

research_topic_task:
  description:
    "Use tools to provide all necessary information to address the question"
  expected_output:
    "The ANSWER that has all relevant information with sources to address the research question"

review_answer_task:
  description:
    "Evaluate the ANSWER according to the following requirements: 
    (1) correctness, 
    (2) usefulness and comprehensiveness, 
    (3) human interpretability
    (4) consideration of causality
    (5) consideration of toxicity and holistic/interdisciplinary evidence
    (6) consideration of standardized ways of analysis and reporting
    (7) longitudinal data 
    (8) consideration of known_aging_biology. 
    Please consider the 8 requirements separately and score the ANSWER on each of them as bad, moderate or good.
    Then make a general evaluation"
  expected_output:
    "JSON format, where each requirement evaluation (and also general evaluation) must have score, pros and cons fields. 
    Use underscores instead of spaces in the field names. Generate no text other than JSON content in the answer. 
    Avoid too many words in json field names.
    For example:
    {{
      'requirement_name': {{
        'score': 'score value',
        'comment': 'additional comment if needed',
        'pros': 'what was good',
        'cons': 'what was bad'
      }} 
      'general_evaluation': {{
        'score': 'score value',
        'comment': 'additional comment if needed',
        'pros': 'what was good',
        'cons': 'what was bad'
      }}
    }}"
  context:
    - "research_topic_task"

As you can see, referencing a task by name does not work, which makes most of the complex use cases for YAML config useless. Another problem is that curly braces in YAML values are not handled properly (I had to double them to escape them), but I will open a separate issue for that.
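To illustrate the brace doubling mentioned above, here is a minimal sketch. It assumes the task text is interpolated with Python's str.format-style substitution (an assumption about the framework, not something confirmed in this thread); the template text and the topic placeholder are made up for the example.

```python
# Assumption: the template is filled in with str.format-style substitution.
template = (
    "Answer the question about {topic}. "
    "Reply as JSON, for example: {{'score': 'good'}}"
)

# Doubled braces come out as literal single braces; {topic} is substituted.
rendered = template.format(topic="aging biology")
print(rendered)

# With single braces, format() treats the JSON key as a placeholder name.
unescaped = "Reply as JSON, for example: {'score': 'good'}"
try:
    unescaped.format(topic="aging biology")
    failed = False
except KeyError as exc:
    failed = True
    print("KeyError:", exc)
```

If the substitution really works this way, any literal JSON example inside a description or expected_output needs every brace doubled, as in the YAML above.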

cblokland90 commented 6 months ago

Yes, it's a mess, especially because context requires you to call a function, which in turn returns a new instance, so the context is actually empty. I copied the existing annotations and refactored them into this:

!! disclaimer !! I am not a Python expert, and this code is highly opinionated and works for my case. It has some shortcuts (for example, it requires an llm prop on your crew class and fails otherwise, but hey, for me it works :)). Also, I think that if you put recursion in here, it will end up in an infinite loop while assembling the agent and task instances, so just don't make circular references in your context definition.

import os

import yaml
from crewai import Agent, Task

def CrewBase(cls):
    class WrappedClass(cls):
        is_crew_class = True

        base_directory = None

        def __init__(self, *args, **kwargs):
            super().__init__(*args, **kwargs)

            # per-instance state; class-level dicts would be shared by every instance
            self.task_instances = {}
            self.agent_instances = {}
            self.agent_config_by_func = {}
            self.task_config_by_func = {}
            self.task_order = []

            self.load_agent_config()
            self.load_task_config()

            # init all agent instances
            for agent_name, agent_config in self.agent_config_by_func.items():
                self.agent_instances[agent_name] = self.create_agent(agent_name, agent_config)

            # init all task instances, in the order they appear in tasks.yaml
            for task_name in self.task_config_by_func:
                self.task_order.append(task_name)
                # get_task_lazy reuses a task that was already built as another task's
                # context, so the context and the task list share one instance
                self.get_task_lazy(task_name)

        def load_agent_config(self) -> None:
            original_agents_config_path = getattr(cls, "agents_config", "config/agents.yaml")
            config = self.load_yaml(os.path.join(self.get_crew_dir(), original_agents_config_path))

            if not config:
                return

            # store all configurations by key name in agent_config_by_func
            for key, value in config.items():
                self.agent_config_by_func[key] = value

        def load_task_config(self) -> None:
            original_tasks_config_path = getattr(cls, "tasks_config", "config/tasks.yaml")
            config = self.load_yaml(os.path.join(self.get_crew_dir(), original_tasks_config_path))

            # an empty YAML file loads as None
            if not config:
                return

            # store all configurations by key name in task_config_by_func
            for key, value in config.items():
                self.task_config_by_func[key] = value

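        def get_crew_dir(self) -> str:
            # use the decorated class: WrappedClass itself is defined in this module
            class_module = cls.__module__
            # a non-empty fromlist makes __import__ return the submodule itself
            # rather than the top-level package when the module name is dotted
            module = __import__(class_module, fromlist=["__file__"])
            module_path = os.path.abspath(module.__file__)
            return os.path.dirname(module_path)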

        def create_agent(self, name, config):

            # YAML parses true/false as booleans, so accept both True and the string "true"
            allow_delegation = str(config.get('allow_delegation', False)).lower() == "true"
            tools = config.get('tools', [])
            tool_instances = []
            for tool in tools:
                tool_instances.append(getattr(self, tool)())
            max_iterations = config.get('max_iterations', 15)

            # remove all keys that are not needed for the agent
            config.pop("allow_delegation", None)
            config.pop("tools", None)
            config.pop("max_iterations", None)

            return Agent(
                name=name,
                config=config,
                llm=self.llm,
                allow_delegation=allow_delegation,
                # crewAI's Agent field is max_iter; the YAML key stays max_iterations
                max_iter=max_iterations,
                tools=tool_instances,
            )

        def create_task(self, name, config):
            # work on a shallow copy so the loaded YAML config is not mutated
            config = dict(config)
            agent_instance = self.create_agent_from_config(config, name)

            # resolve context names to task instances; a single name becomes a one-item list
            context = config.get('context')
            if isinstance(context, str):
                context = [context]
            if isinstance(context, list):
                config['context'] = [self.get_task_lazy(task_name) for task_name in context]

            # remove agent from the task config dict
            config.pop('agent')

            return Task(agent=agent_instance, name=name, config=config, llm=self.llm)

        def create_agent_from_config(self, config, name):
            agent = config['agent']
            # check if agent is a string; if it's a dict, we assume it is an agent config compatible with create_agent()
            if isinstance(agent, str):
                # check if agent exists
                if agent not in self.agent_instances:
                    raise Exception(f"Agent {agent} does not exist")
                agent_instance = self.agent_instances[agent]
            elif isinstance(agent, dict):
                agent_instance = self.create_agent(f"{name}_agent", agent)
            else:
                raise Exception("Agent must be a string or a dict")

            self.agent_instances[name] = agent_instance
            return agent_instance

        def get_task_lazy(self, name):
            if name not in self.task_instances:
                self.task_instances[name] = self.create_task(name, self.task_config_by_func[name])

            return self.task_instances[name]

        @staticmethod
        def load_yaml(config_path: str):
            with open(config_path, "r") as file:
                return yaml.safe_load(file)

    return WrappedClass

def crew(func):
    def wrapper(self, *args, **kwargs):

        # get task instances by order
        tasks = [self.task_instances[task_name] for task_name in self.task_order]
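        # build a list without duplicates: the same agent may be registered under several names
        agents = list({id(agent): agent for agent in self.agent_instances.values()}.values())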

        return func(self, agents, tasks, *args, **kwargs)

    return wrapper
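Regarding the warning about circular references: rather than relying on people avoiding them, the task config could be checked before any instances are built. The following is only a sketch under assumptions, not part of the code above. The function name find_context_cycle is hypothetical, and it assumes each task's context is a list of task names (or absent).

```python
def find_context_cycle(task_configs):
    """Return a list of task names forming a cycle, or None if there is none."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {name: WHITE for name in task_configs}
    path = []

    def visit(name):
        state[name] = GREY
        path.append(name)
        for dep in task_configs[name].get("context", []):
            if dep not in task_configs:
                raise KeyError(f"Unknown task in context: {dep}")
            if state[dep] == GREY:
                # dep is already on the current path, so we found a cycle
                return path[path.index(dep):] + [dep]
            if state[dep] == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        state[name] = BLACK
        return None

    for name in task_configs:
        if state[name] == WHITE:
            cycle = visit(name)
            if cycle:
                return cycle
    return None


ok = {"research": {}, "review": {"context": ["research"]}}
bad = {"a": {"context": ["b"]}, "b": {"context": ["a"]}}
print(find_context_cycle(ok))
print(find_context_cycle(bad))
```

Calling something like this right after loading tasks.yaml would turn the infinite recursion into a readable error message.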

And the crew class:

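from os.path import dirname

from crewai import Crew, Process
from crewai_tools import SerperDevTool, WebsiteSearchTool
from langchain_openai import ChatOpenAI

# CrewBase and crew are the decorators defined above; import them from wherever you saved them

@CrewBase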
class MyCrew:
    agents_config = f"{dirname(__file__)}/config/agents.yaml"
    tasks_config = f"{dirname(__file__)}/config/tasks.yaml"

    def __init__(self):
        self.llm = ChatOpenAI(temperature=0.7, model_name="gpt-4-turbo")

    @crew
    def crew(self, agents, tasks) -> Crew:
        return Crew(
            agents=agents,
            tasks=tasks,
            llm=self.llm,
            process=Process.sequential,
            memory=True,
            embedder={
                "provider": "gpt4all"
            },
            verbose=2
        )

    def search_tool(self):
        return SerperDevTool()

    def web_rag_tool(self):
        return WebsiteSearchTool()

Now you can use agents.yaml to define agents that don't necessarily have a task but can be used for delegation.
You can also define your tasks and inline the agent config there, which makes it much more organized (imo).

# tasks.yaml
my_task:
  agent: # object to create or string to reference one in agents.yaml
    role: ...
    tools: [ search_tool, web_rag_tool ] # must match with function name in your crew class
    goal: |
      ...
    backstory: |
      ...
  context: [ ... ] # list of strings referencing a key in tasks.yaml
  description: |
    ...
  expected_output: |
    ...
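
The key point of the context list is that each name is resolved once, so the task in a context is the very same instance that the crew later executes. Here is a stripped-down sketch of that idea. StubTask and build_tasks are hypothetical stand-ins for illustration only (not crewAI classes), and the configs dict plays the role of a loaded tasks.yaml.

```python
from dataclasses import dataclass, field


@dataclass
class StubTask:
    """Hypothetical stand-in for crewai.Task, used only for illustration."""
    name: str
    description: str
    context: list = field(default_factory=list)


def build_tasks(task_configs):
    """Create one StubTask per key, resolving context names to shared instances."""
    built = {}

    def get(name):
        if name not in built:
            cfg = task_configs[name]
            # Resolve dependencies first (assumes no circular references).
            deps = [get(dep) for dep in cfg.get("context", [])]
            built[name] = StubTask(name, cfg["description"], deps)
        return built[name]

    for name in task_configs:
        get(name)
    return built


configs = {
    "review_answer_task": {
        "description": "Evaluate the ANSWER",
        "context": ["research_topic_task"],
    },
    "research_topic_task": {"description": "Use tools to research the question"},
}
tasks = build_tasks(configs)
print(tasks["review_answer_task"].context[0] is tasks["research_topic_task"])
```

Because the lookup is memoized, the order of keys in the file does not matter: a task referenced before it is defined is simply built early and reused.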

github-actions[bot] commented 3 months ago

This issue is stale because it has been open for 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 3 months ago

This issue was closed because it has been stalled for 5 days with no activity.