Stable-Baselines-Team / stable-baselines3-contrib

Contrib package for Stable-Baselines3 - Experimental reinforcement learning (RL) code
https://sb3-contrib.readthedocs.io
MIT License

[Feature Request] Better support for action masking for vectorized environments #68

Open BolunDai0216 opened 2 years ago

BolunDai0216 commented 2 years ago

Motivation: Stable-Baselines3 (SB3) has introduced support for action masking, which is a great feature. However, this API requires the user to wrap the environment in an ActionMasker wrapper. The issue is that some environments (e.g., gym-microrts, pettingzoo) directly provide a vectorized interface, so there is no opportunity to apply the ActionMasker wrapper.

Feature: Extend MaskablePPO to work natively with vectorized environments. With this extension, SB3 with PPO and action masking could be used with gym-microrts and pettingzoo.
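
To illustrate what is being asked for, here is a minimal, library-free sketch of batched action masking. It assumes a natively vectorized environment that returns one boolean mask row per sub-environment, with shape (n_envs, n_actions). The names `masked_softmax`, `sample_batch`, `batch_masks`, and `batch_logits` are hypothetical and are not part of the SB3 or sb3-contrib API.

```python
import math
import random
from typing import List, Sequence


def masked_softmax(logits: Sequence[float], mask: Sequence[bool]) -> List[float]:
    """Softmax over valid actions only; invalid actions get probability 0."""
    if len(logits) != len(mask):
        raise ValueError("logits and mask must have the same length")
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Subtract the largest valid logit for numerical stability.
    peak = max(value for value, ok in zip(logits, mask) if ok)
    exps = [math.exp(value - peak) if ok else 0.0 for value, ok in zip(logits, mask)]
    total = sum(exps)
    return [e / total for e in exps]


def sample_batch(
    batch_logits: Sequence[Sequence[float]],
    batch_masks: Sequence[Sequence[bool]],
    rng: random.Random,
) -> List[int]:
    """Sample one action per sub-environment, respecting that row's mask."""
    if len(batch_logits) != len(batch_masks):
        raise ValueError("need one mask row per sub-environment")
    actions = []
    for logits, mask in zip(batch_logits, batch_masks):
        probs = masked_softmax(logits, mask)
        # Zero-weight (masked) actions are never drawn.
        actions.append(rng.choices(range(len(probs)), weights=probs, k=1)[0])
    return actions


# Hypothetical output of a natively vectorized env with 2 sub-environments
# and 3 discrete actions: the masks arrive already batched.
batch_masks = [[True, False, True], [False, True, False]]
batch_logits = [[0.5, 9.0, 0.5], [3.0, -1.0, 3.0]]
actions = sample_batch(batch_logits, batch_masks, random.Random(0))
```

The point of the sketch is that the masks arrive already batched from the environment, so no per-environment wrapper is needed to produce them. How MaskablePPO would obtain such a batch from a given vectorized environment is left open here.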

BolunDai0216 commented 2 years ago

@vwxyzjn

araffin commented 2 years ago

Hello, this is a duplicate of https://github.com/Stable-Baselines-Team/stable-baselines3-contrib/issues/49#issuecomment-957629253

We would appreciate a PR that solves this issue ;)