opendilab / DI-engine

OpenDILab Decision AI Engine. The Most Comprehensive Reinforcement Learning Framework B.P.
https://di-engine-docs.readthedocs.io
Apache License 2.0

Export multi-agent policies and shared training #822

Closed ardian-selmonaj closed 1 month ago

ardian-selmonaj commented 1 month ago

I have two questions that I could not yet find answers to:

1) How can I extract specific policies from a trained multi-agent model? E.g. when three agents, "agent_1", "agent_2" and "agent_3", were trained, I would like to get the prediction of one specific agent only.

2) How can I configure MARL training so that multiple agents share the same policy? E.g. "agent_1" and "agent_2" use the same policy pi_1.

Ray RLlib supports this, but it has far fewer algorithms than DI-engine, so I would highly appreciate having these features here. Thank you!

PaParaZz1 commented 1 month ago

By default, DI-engine's MARL implementation shares the same policy across all agents. This scheme is commonly used in multi-agent cooperative tasks. You can start from existing algorithms in DI-engine such as QMIX or MAPPO. Here is a simple example on the PettingZoo simple spread environment.
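To illustrate what sharing one policy means in practice, here is a minimal plain-PyTorch sketch. It is not DI-engine code: the network, the sizes, and the agent-axis layout are assumptions chosen for illustration. One set of weights is applied to every agent's observation, and the prediction of a single agent is obtained by indexing the agent axis.

```python
import torch
import torch.nn as nn

# Hypothetical sizes for illustration only; use the values of your environment.
N_AGENTS, OBS_DIM, N_ACTIONS = 3, 18, 5

# A single network whose weights are shared by every agent.
shared_policy = nn.Sequential(
    nn.Linear(OBS_DIM, 64),
    nn.ReLU(),
    nn.Linear(64, N_ACTIONS),
)

# Observations stacked along an agent axis: (batch, agent, obs_dim).
# This layout is an assumption, not a documented DI-engine format.
obs = torch.randn(4, N_AGENTS, OBS_DIM)

with torch.no_grad():
    # nn.Linear acts on the last dimension, so all agents are evaluated
    # with the same weights in one forward pass: (batch, agent, n_actions).
    logits = shared_policy(obs)

# Keep only the prediction of one agent, e.g. the second one (index 1).
agent_index = 1
agent_logits = logits[:, agent_index]        # (batch, n_actions)
agent_actions = agent_logits.argmax(dim=-1)  # (batch,) greedy actions
```

Because the weights are shared, the per-agent result is the same as feeding only that agent's observation, `obs[:, agent_index]`, through the network.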

After training, if you want to extract/export the learned policy, you can find the PyTorch model state_dict in the saved checkpoint.
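As a rough sketch of that export step in plain PyTorch: the snippet below first writes a stand-in checkpoint so that it runs on its own, then loads it and restores the weights. The `"model"` key, the file name, and the `build_model` helper are assumptions for illustration; inspect `checkpoint.keys()` on your own file and rebuild the model class that matches your training config.

```python
import os
import tempfile

import torch
import torch.nn as nn


def build_model() -> nn.Module:
    # Hypothetical architecture; it must match the one used in training.
    return nn.Sequential(nn.Linear(18, 64), nn.ReLU(), nn.Linear(64, 5))


# Stand-in for a finished training run: save a checkpoint dict that keeps
# the weights under a "model" key (an assumed layout, check your own file).
trained = build_model()
ckpt_path = os.path.join(tempfile.mkdtemp(), "ckpt_example.pth.tar")
torch.save({"model": trained.state_dict()}, ckpt_path)

# Export step: load the checkpoint on CPU and pull out the state_dict.
checkpoint = torch.load(ckpt_path, map_location="cpu")
state_dict = checkpoint["model"]

# Restore the weights into a fresh model instance for inference.
restored = build_model()
restored.load_state_dict(state_dict)
restored.eval()
```

With a real checkpoint, only the loading part is needed; replace `ckpt_path` with the path of the saved file.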