flowersteam / lamorel

Lamorel is a Python library designed for RL practitioners eager to use Large Language Models (LLMs).

Expand to multi-agent scenarios. #28

Open ewanlee opened 12 months ago

ewanlee commented 12 months ago

Hello! May I ask if it is possible to extend the scenario to multiple agents? For example, a task involves two RL agents, with each agent's policy being an LLM. At each timestep, the two agents need to separately query their respective LLM servers to make decisions, interact with the environment, collect their own data, and ultimately update their own LLM policies.

ClementRomac commented 11 months ago

Hi,

For now, using multiple LLMs is not trivial: it would require significant changes to the distributed architecture (mostly in the server and the dispatcher).

There is, however, a workaround to achieve something similar (i.e. having multiple sets of weights for the same LLM) using PEFT adapters. As in some of our examples, you can use PEFT to add adapters to the LLM and train only these adapters. PEFT actually allows you to attach multiple adapters to the same model (https://huggingface.co/docs/peft/developer_guides/mixed_models). Note that you need to carefully set the right adapter every time a specific agent uses the model (which may not be convenient...), as in the sketch below.
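A minimal sketch of this adapter-switching pattern, using plain `transformers` + `peft` rather than lamorel's API; the model name, LoRA hyperparameters, and the adapter names `"agent_0"`/`"agent_1"` are placeholders:

```python
# Sketch: one frozen base LLM shared by two agents, each with its own LoRA adapter.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base_model = AutoModelForCausalLM.from_pretrained("gpt2")  # any causal LM
tokenizer = AutoTokenizer.from_pretrained("gpt2")

lora_config = LoraConfig(r=8, lora_alpha=16, target_modules=["c_attn"],
                         task_type="CAUSAL_LM")

# First adapter ("agent_0") wraps the frozen base model.
model = get_peft_model(base_model, lora_config, adapter_name="agent_0")
# Second adapter shares the same frozen base weights.
model.add_adapter("agent_1", lora_config)

def act(agent_name, prompt):
    # Switch to the adapter owned by this agent before any forward pass.
    model.set_adapter(agent_name)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, max_new_tokens=16)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)

print(act("agent_0", "Agent 0 observes: ..."))
print(act("agent_1", "Agent 1 observes: ..."))
```

The same switching call (`set_adapter`) would have to be made before every generation and every gradient update so that each agent's rollouts only ever touch its own adapter weights.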

ewanlee commented 11 months ago

Thank you very much for the suggestion! I'll try to see whether separate LoRA adapters can serve as a stopgap for multi-agent tasks 😊😊