The possibility of integrating multi-agent reinforcement learning libraries

Hi @fantastic8124,

It should be possible to write a wrapper for other RL interfaces (we have several examples here: https://github.com/waymo-research/waymax/tree/main/waymax/env/wrappers).

It looks like MARLlib uses Ray's MultiAgentEnv interface, which has a sparse representation for agents: actions are passed in as a dict of {agent_key: action} and observations are returned as a dict of {agent_key: obs}. Waymax uses a dense representation by default (we expect and return fixed-size arrays, together with a valid mask marking which agent slots are in use), but it should be possible to convert between the two using an adapter.
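As a rough illustration of what such an adapter would need to do, here is a minimal sketch of the dense-to-sparse conversion and its inverse. It is not Waymax or Ray API: the function names, the agent keys, and the fill value are hypothetical, and plain Python lists stand in for the arrays to keep the sketch dependency-free.

```python
from typing import Dict, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def dense_to_sparse(
    values: Sequence[T], valid: Sequence[bool], agent_keys: Sequence[str]
) -> Dict[str, T]:
    """Keep only the entries of valid agents, keyed by agent (hypothetical helper)."""
    return {key: value for key, value, ok in zip(agent_keys, values, valid) if ok}


def sparse_to_dense(
    by_agent: Dict[str, T], agent_keys: Sequence[str], fill: T
) -> Tuple[List[T], List[bool]]:
    """Scatter a per-agent dict back into fixed-size lists plus a valid mask."""
    values = [by_agent.get(key, fill) for key in agent_keys]
    valid = [key in by_agent for key in agent_keys]
    return values, valid


# Assumed fixed ordering of agent slots; the key names are made up.
agent_keys = ["agent_0", "agent_1", "agent_2"]

# Dense observations with a valid mask: agent_1 is not present in the scene.
dense_obs = [[0.0, 1.0], [2.0, 3.0], [4.0, 5.0]]
valid = [True, False, True]
sparse_obs = dense_to_sparse(dense_obs, valid, agent_keys)

# Sparse actions coming back from the policy, converted to the dense layout.
sparse_actions = {"agent_0": [0.5, 0.0], "agent_2": [-0.5, 0.1]}
dense_actions, action_valid = sparse_to_dense(
    sparse_actions, agent_keys, fill=[0.0, 0.0]
)
```

A real wrapper would call something like `dense_to_sparse` on the observations and rewards it returns, and something like `sparse_to_dense` on the action dict it receives, so that invalid agent slots never appear in the dicts the training library sees.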

With the initial release just out, we'll be busy with bug fixes and basic features for the time being, but we would be open to implementing this in the future or to accepting a pull request that adds it.

waymo-research / waymax

The possibility of integrating multi-agent reinforcement learning libraries #3