crewAIInc / crewAI

Framework for orchestrating role-playing, autonomous AI agents. By fostering collaborative intelligence, CrewAI empowers agents to work together seamlessly, tackling complex tasks.
https://crewai.com
MIT License

[BUG] Error with alternating messages structure in instruct mode #1454

Open arthursn opened 1 day ago

arthursn commented 1 day ago

Description

Issue

I encounter the following error when using models in instruct mode (served in Databricks):

{"error_code":"BAD_REQUEST","message":"Bad request: Chat message input roles must alternate (user -> assistant -> u -> a -> ...) with an optional system at the start. Tool messages are optional and must follow a preceding assistant message containing tool calls.\\n"}

This error has been reproduced with databricks/databricks-meta-llama-3-1-70b-instruct, meta-llama-3-1-8b-instruct and dbrx-instruct.
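For illustration, the structure these endpoints reject looks like the following hypothetical message list, where two consecutive "user" messages break the required alternation (the content here is made up):

messages = [
    {"role": "system", "content": "You are a research assistant."},
    {"role": "user", "content": "Summarize the task."},
    # Rejected: a second consecutive "user" message breaks alternation
    {"role": "user", "content": "Now list your tools."},
]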

Steps to Reproduce

  1. Set up an agent using Llama 3.1 in instruct mode served in [Azure] Databricks
  2. Run crew

Expected behavior

The error should not be triggered when using LLMs that require an alternating message structure.

Screenshots/Code snippets

NA

Operating System

Ubuntu 22.04

Python Version

3.11

crewAI Version

0.70.1

crewAI Tools Version

0.12.1

Virtual Environment

Poetry

Evidence

(screenshot: example crew run demonstrating the error)

In the example above, the AISteelScientistCrew class has a constructor that accepts an LLM:

from typing import Optional

from crewai import LLM, Agent
from crewai.project import CrewBase, agent

from ai_steel_scientist.llm import LLMInstruct


@CrewBase
class AISteelScientistCrew:
    def __init__(self, llm: Optional[LLM] = None):
        # Fall back to the custom instruct-mode wrapper when no LLM is injected
        self.llm = llm or LLMInstruct(...)

    @agent
    def researcher(self) -> Agent:
        return Agent(llm=self.llm, ...)

    ...

Possible Solution

I have implemented a custom LLMInstruct class that "handles large language models (LLMs) in instruct mode, ensuring messages alternate between 'user' and 'assistant' roles."

from typing import Any, Dict, List, Optional

from crewai import LLM


class LLMInstruct(LLM):
    @staticmethod
    def ensure_alternating_roles(
        messages: List[Dict[str, str]],
        filler_messages: Optional[Dict[str, str]] = None,
    ) -> List[Dict[str, str]]:
        # Default filler content is empty; built here to avoid a mutable
        # default argument
        if filler_messages is None:
            filler_messages = {"user": "", "assistant": ""}
        fixed_messages = []
        allowed_roles = ["system", "user", "assistant"]
        previous_role = None
        for msg in messages:
            role = msg["role"]
            assert role in allowed_roles, f"Role {role} not allowed"
            if role == previous_role:
                # Two consecutive messages with the same role: insert a filler
                # message with the opposite role to restore alternation
                filler_role = "assistant" if role == "user" else "user"
                fixed_messages.append(
                    {
                        "role": filler_role,
                        "content": filler_messages[filler_role],
                    }
                )
            fixed_messages.append(msg)
            previous_role = role
        return fixed_messages

    def call(
        self,
        messages: List[Dict[str, str]],
        callbacks: Optional[List[Any]] = None,
    ) -> str:
        # Normalize the message sequence before delegating to the base LLM
        return super().call(
            messages=LLMInstruct.ensure_alternating_roles(messages),
            callbacks=callbacks or [],
        )
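As a quick sanity check, this is what the helper does to a message list with consecutive user messages (a minimal sketch with made-up content):

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "First instruction"},
    {"role": "user", "content": "Second instruction"},
]
fixed = LLMInstruct.ensure_alternating_roles(messages)
# fixed now contains an empty {"role": "assistant", "content": ""} filler
# between the two user messages, restoring the required alternation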

Usage:

from ai_steel_scientist.crew import AISteelScientistCrew
from ai_steel_scientist.llm import LLMInstruct

llm = LLMInstruct(model="databricks/databricks-meta-llama-3-1-70b-instruct")
result = (
    AISteelScientistCrew(llm=llm)
    .crew()
    .kickoff(inputs={"topic": "modelling of austenite flow curves"})
)

Additional context

litellm has a user_continue_message feature that seems to partially address this issue; see the litellm documentation.
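For reference, a minimal sketch of how that feature is invoked, based on my reading of the litellm docs (the parameter and its provider support should be verified against your litellm version):

import litellm

# Assumed usage: user_continue_message supplies a filler user message that
# litellm can append when the conversation would otherwise violate the
# provider's role-alternation rules
response = litellm.completion(
    model="databricks/databricks-meta-llama-3-1-70b-instruct",
    messages=[
        {"role": "user", "content": "Hello"},
        {"role": "assistant", "content": "Hi! How can I help?"},
    ],
    user_continue_message={"role": "user", "content": "Please continue."},
)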

arthursn commented 1 day ago

Raised issue in litellm: https://github.com/BerriAI/litellm/issues/6257