Open ewanlee opened 12 months ago
Hi,
For now, using multiple LLMs is not trivial: it would require significant changes to the distributed architecture (mostly in the server and the dispatcher).
There is, however, a workaround to achieve something similar (i.e., having multiple sets of weights for the same LLM) using PEFT adapters. As in some of our examples, you can use PEFT to add adapters to the LLM and train only those adapters. PEFT actually allows you to attach multiple adapters to a single model (https://huggingface.co/docs/peft/developer_guides/mixed_models). Note that you need to carefully set the right adapter every time you use a specific agent (which may not be convenient...).
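A rough sketch of what that can look like (the base model, adapter names, and hyperparameters below are purely illustrative):

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# One base LLM, one LoRA adapter per agent.
# "gpt2" and the adapter names are only for illustration.
base = AutoModelForCausalLM.from_pretrained("gpt2")
config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"], task_type="CAUSAL_LM")

model = get_peft_model(base, config, adapter_name="agent_0")
model.add_adapter("agent_1", config)

# Activate the right adapter before each agent acts or trains.
model.set_adapter("agent_0")
# ... forward pass / training step for agent 0 ...
model.set_adapter("agent_1")
# ... forward pass / training step for agent 1 ...
```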
Thank you very much for the suggestion! I'll try using different LoRA adapters as a temporary way to handle multi-agent tasks 😊😊
Hello! May I ask if it is possible to extend this to multiple agents? For example, a task involving two RL agents, where each agent's policy is an LLM. At each timestep, the two agents need to separately query their respective LLM servers to make decisions, interact with the environment, collect their own data, and ultimately update their own LLM policies.
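Roughly, the loop I have in mind looks like this (purely illustrative pseudocode; `env`, `query_llm_server`, and `update_policy` are hypothetical placeholders, not part of the library):

```python
# Two agents, each backed by its own LLM server (URLs are made up).
agents = ["agent_0", "agent_1"]
servers = {"agent_0": "http://localhost:8000", "agent_1": "http://localhost:8001"}
buffers = {name: [] for name in agents}

obs = env.reset()
done = False
while not done:
    # Each agent queries its own LLM server for a decision.
    actions = {name: query_llm_server(servers[name], obs[name]) for name in agents}
    next_obs, rewards, done = env.step(actions)
    # Each agent collects its own transition data.
    for name in agents:
        buffers[name].append((obs[name], actions[name], rewards[name], next_obs[name]))
    obs = next_obs

# Each agent updates its own LLM policy on its own data.
for name in agents:
    update_policy(servers[name], buffers[name])
```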