Enable Usage of Different LLMs or SLMs for Task-Specific Agents

rezzie-rich commented 6 months ago

What problem or use case are you trying to solve?

Currently, the OpenDevin project uses a single AI model to power all agents, which might not fully leverage the specialized capabilities of various Large Language Models (LLMs) or Small Language Models (SLMs) tailored for specific tasks. A more flexible approach, where each agent can use a different LLM or SLM based on its task and the model's capabilities, could improve the efficiency, accuracy, and overall performance of the agents.

Describe the UX of the solution you'd like

Ideally, developers would have the ability to configure each agent with a preferred AI model directly from the project settings or through an API. This configuration could include selecting the model type, specifying any required parameters for the model, and setting up model-specific optimization techniques. Additionally, the system should provide feedback on the performance implications of selecting different models for specific tasks, assisting developers in making informed decisions.

Do you have thoughts on the technical implementation?

The technical implementation could involve extending the current agent configuration schema to include an aiModel field where different LLM or SLM identifiers can be specified. This would require a modular architecture that allows for the easy integration and interchangeability of AI models. Also, implementing a model abstraction layer could help manage the interactions between agents and their respective models, ensuring that changes in one part of the system do not adversely affect others.
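
A minimal sketch of the idea above, purely for illustration. Only the aiModel field name comes from this proposal; `AgentConfig`, `modelParams`, `ModelBackend`, `StubBackend`, `register`, and `backend_for` are hypothetical names and are not part of the OpenDevin codebase. It shows a config entry carrying a model identifier and a small abstraction layer that resolves that identifier to a backend.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Protocol


class ModelBackend(Protocol):
    """Hypothetical abstraction layer: anything that can complete a prompt."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class AgentConfig:
    """Agent configuration extended with the proposed aiModel field."""

    name: str
    aiModel: str  # identifier of the LLM or SLM backing this agent
    modelParams: Dict[str, object] = field(default_factory=dict)  # assumed field


@dataclass
class StubBackend:
    """Stand-in backend; a real one would call an actual LLM or SLM."""

    model_id: str
    params: Dict[str, object] = field(default_factory=dict)

    def complete(self, prompt: str) -> str:
        # Echo the prompt so the routing can be observed without a real model.
        return f"[{self.model_id}] {prompt}"


# Registry mapping model identifiers to backend factories (hypothetical).
BACKENDS: Dict[str, Callable[[AgentConfig], ModelBackend]] = {}


def register(model_id: str, factory: Callable[[AgentConfig], ModelBackend]) -> None:
    """Make a backend factory available under a model identifier."""
    BACKENDS[model_id] = factory


def backend_for(config: AgentConfig) -> ModelBackend:
    """Resolve an agent's aiModel identifier to a concrete backend."""
    try:
        factory = BACKENDS[config.aiModel]
    except KeyError:
        raise ValueError(f"unknown aiModel: {config.aiModel!r}") from None
    return factory(config)


# Usage: agents only ever talk to ModelBackend, never to a specific model.
register("stub-model", lambda cfg: StubBackend(cfg.aiModel, cfg.modelParams))
coder = AgentConfig(name="coder", aiModel="stub-model")
reply = backend_for(coder).complete("write a unit test")
```

Because agents depend only on the `complete` interface, swapping the model behind an agent is a configuration change rather than a code change, which is the isolation the abstraction layer is meant to provide.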

Describe alternatives you've considered

An alternative could be to enhance the current single AI model to be more adaptable and capable of handling a wider range of tasks effectively. However, this might lead to increased complexity and resource consumption, potentially diminishing the system's performance on tasks that could be more efficiently handled by specialized models.

Additional context

Integrating multiple AI models into the system could also open up new possibilities for experimentation and optimization, enabling the community to explore innovative ways of improving task-specific performance. Additionally, this feature aligns with the ethos of the open-source community by promoting flexibility, experimentation, and continuous improvement.

rezzie-rich commented 6 months ago

Think of this feature request in terms of a workplace analogy: if we consider each agent within OpenDevin as different job positions within a company, and the AI models as the employees filling those positions, it becomes clear why this feature is so valuable. Just like in a company, where you wouldn’t expect a single employee to excel in sales, development, accounting, and all other roles simultaneously, it's unrealistic to expect one AI model to be the best fit for every distinct task that our agents are handling. Different tasks require different expertise. By allowing each agent to utilize an AI model that's specialized for its specific tasks, we're essentially hiring the right 'employee' for each 'job position', optimizing our workforce for efficiency and effectiveness. This approach not only enhances the performance of individual agents but also significantly improves the overall productivity and capabilities of our system.

rbren commented 6 months ago

Closing in favor of https://github.com/OpenDevin/OpenDevin/issues/117

rezzie-rich commented 6 months ago

I think this captures a slightly different issue with a better solution.

rbren commented 6 months ago

Hmm I see why you say that.

There's a long-term goal here of creating a "Meta Agent", where several domain-specific agents are used for different tasks (which the AutoDev issue covers). To me, this issue is specifically about configuring the meta agent, and isn't quite as applicable in a single-agent world.

What do you think about adding this as an Acceptance Criterion for the meta agent (aka autodev)?

rezzie-rich commented 6 months ago

This can definitely be a criterion for the meta agents.

The goal is to have different AI models power different agents, rather than a single model powering all of them. There are some benefits to that. For example, mistral/mixtral could power a generalized agent, deepseek-coder-instruct could power the agents responsible for code generation, and white-rabbit-neo could power the agents responsible for testing or cybersecurity. Instead of just using task-specific agents, those task-specific agents would be powered by task-specific AI models as well.

This way, only one model is active at a time, and multiple relatively small open-source LLMs can collaborate to outperform giant LLMs like GPT-4. The concept is slightly similar to how mixtral 8x7b works.
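
A rough sketch of the "one model active at a time" idea, using the model names from the example above. The role names, the `SingleActiveModelPool` class, and its methods are hypothetical and only illustrate the swapping logic; actual loading and unloading of model weights is left as a comment.

```python
from typing import Dict, List, Optional

# Hypothetical role-to-model mapping based on the example in this comment.
AGENT_MODELS: Dict[str, str] = {
    "general": "mixtral",
    "codegen": "deepseek-coder-instruct",
    "testing": "white-rabbit-neo",
    "security": "white-rabbit-neo",
}


class SingleActiveModelPool:
    """Keeps at most one model marked as active, swapping only when needed."""

    def __init__(self, agent_models: Dict[str, str]) -> None:
        self.agent_models = dict(agent_models)
        self.active: Optional[str] = None
        self.swaps: List[str] = []  # record of each model that was loaded

    def activate_for(self, agent: str) -> str:
        """Return the model for an agent, swapping it in if it is not active."""
        model = self.agent_models[agent]
        if model != self.active:
            # A real system would unload self.active and load `model` here.
            self.active = model
            self.swaps.append(model)
        return model


pool = SingleActiveModelPool(AGENT_MODELS)
for role in ["general", "codegen", "testing", "security"]:
    pool.activate_for(role)
```

In this run the "security" step does not trigger a swap, because it shares a model with "testing"; ordering work so that consecutive steps reuse the active model would reduce how often models have to be reloaded.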

rezzie-rich commented 6 months ago

This can be an option: users can either use the same AI model to power all the agents or choose different models for different agents, depending on their preference.
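
One way this option could behave, sketched under the assumption that there is a single default model plus optional per-agent overrides. The function name `resolve_model` and the agent names are hypothetical.

```python
from typing import Dict, Optional


def resolve_model(
    agent: str,
    default_model: str,
    overrides: Optional[Dict[str, str]] = None,
) -> str:
    """Pick the per-agent model if one is set, else fall back to the default."""
    return (overrides or {}).get(agent, default_model)


# Same model for every agent: no overrides configured.
same_for_all = resolve_model("codegen", "mixtral")

# Different models for different agents: only the overridden agent changes.
overrides = {"codegen": "deepseek-coder-instruct"}
per_agent = resolve_model("codegen", "mixtral", overrides)
fallback = resolve_model("planner", "mixtral", overrides)
```

With this shape, the single-model setup stays the default and users opt in to per-agent models one agent at a time.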

rbren commented 6 months ago

Yeah agree--this could hypothetically be very useful.

rezzie-rich commented 6 months ago

If users can choose multiple AI models, an SLM like phi-2 can be used for 'api meta agents', making the system more efficient overall.

Reference to issues #117 comment

All-Hands-AI / OpenHands

Enable Usage of Different LLMs or SLMs for Task-Specific Agents #486