alex-petrenko / sample-factory

High throughput synchronous and asynchronous reinforcement learning
https://samplefactory.dev
MIT License

Non-symmetric action/observation spaces for multiagent environments? #266

Closed MatthewCWeston closed 1 year ago

MatthewCWeston commented 1 year ago

I've been trying to write a wrapper for several popular MARL environments (in particular, the multiagent particle environments from the MADDPG paper) using the custom multiagent environment template, but it seems to only support a single action space for all agents. Have I misread the code, or are only symmetric action spaces supported at present?

alex-petrenko commented 1 year ago

Yeah, this is one thing we currently don't explicitly support; SF expects mostly homogeneous agents. In the past, my recommendation was to define a single, larger action space and have different policies use subsets of it, and you can do the same for observations (e.g. zero-pad smaller observations up to a common shape).
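Roughly, the padding trick looks like this (a minimal numpy sketch of the idea, not SF API; `pad_observation`, `mask_action`, and the modulo remapping scheme are all hypothetical):

```python
import numpy as np

def pad_observation(obs, target_dim):
    """Zero-pad a per-agent observation vector up to the shared
    maximum dimension, so every agent reports the same obs shape."""
    obs = np.asarray(obs, dtype=np.float32)
    padded = np.zeros(target_dim, dtype=np.float32)
    padded[: obs.shape[0]] = obs
    return padded

def mask_action(action, n_valid):
    """Map an action sampled from the shared Discrete(max_n) space
    onto this agent's smaller valid range (one possible scheme:
    wrap out-of-range actions with modulo)."""
    return int(action) % n_valid
```

A wrapper around the env would then pad each agent's observation on the way out and remap each agent's action on the way in; the policies never see that the underlying per-agent spaces differ.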

It would not be that difficult to modify the code to support different obs/action spaces for different policies. I consciously decided not to do it in this version because the code is already pretty messy around the tensor shapes and data flow. But maybe in future versions :)

Sorry if this is not the answer you expected. Sometimes you have to make compromises between performance, supported features, and development time.

MatthewCWeston commented 1 year ago

No worries. I'll try that approach, check around similar libraries, and maybe take a shot at implementing it myself, if that looks like the cleanest option. If I get somewhere with modifying the codebase, I'll try to submit a PR.