Open vwxyzjn opened 2 years ago
I have a working version of microrts integrated with PettingZoo to expose each unit in the game as an independent agent :) I assume you are describing a less "extreme" API where we have agent = player, right?
Is the goal to make PettingZoo-to-SB3 wrappers work (e.g. vectorization)?
Oh @kachayev that's awesome! @BolunDai0216 is interested in working on this. Would you mind sharing your version here?
Absolutely! I'll dig it up tomorrow
Okay, I completely blanked on this. This is (partially) the code I'm using in my experiments. I tried to cherry-pick it without any dependencies on my implementation of the environment. I think that the use case of having an API for 2 players would be much easier: there won't be any problems with having a dynamic number of agents, the obs and action spaces are the same for both players, no problems with rewards/infos, etc. It will be just a little bit of index juggling when putting obs and actions in place. Having each unit as a separate agent, as you see here, is more involved. And I certainly don't have a fully fledged solution that would cover most use cases (this one is tied to my specific algo only). Also, note that this is the AEC API. Not sure if the goal here is to have only AEC, or other APIs as well. Support for parallel_env would be cool too.
```python
import gym
import numpy as np
from pettingzoo import AECEnv
from pettingzoo.utils import agent_selector

from gym_microrts.envs.vec_env import MicroRTSGridModeSharedMemVecEnv


class MicroRTSAEC(AECEnv, MicroRTSGridModeSharedMemVecEnv):
    def __init__(
        self,
        opponent,
        agent_vision_patch=(5, 5),
        partial_obs=False,
        max_steps=2000,
        render_theme=2,
        frame_skip=0,
        map_path="maps/10x10/basesTwoWorkers10x10.xml",
        reward_weight=np.array([0.0, 1.0, 0.0, 0.0, 0.0, 5.0]),
    ):
        self.agent_vision_patch = agent_vision_patch
        # initialize the underlying vec env directly; super() starting past
        # MicroRTSGridModeSharedMemVecEnv in the MRO would skip its __init__
        MicroRTSGridModeSharedMemVecEnv.__init__(
            self,
            0,
            1,
            partial_obs,
            max_steps,
            render_theme,
            frame_skip,
            [opponent],
            [map_path],
            reward_weight,
        )
        self._agent_selector = agent_selector([])  # empty before we start
        self.agent_selection = None
        self.agent_observation_space = gym.spaces.Box(
            low=0.0,
            high=1.0,
            shape=(self.agent_vision_patch[0], self.agent_vision_patch[1], sum(self.num_planes)),
            dtype=np.int32,
        )
        self.agent_action_space = gym.spaces.MultiDiscrete(np.array(self.action_space_dims))
        self._reset_actions = np.zeros_like(self.actions)

    def observation_space(self, agent):
        """All agents have the same obs space."""
        return self.agent_observation_space

    def action_space(self, agent):
        """All agents have the same action space."""
        return self.agent_action_space

    def reset(self):
        """Note that we don't return obs here as we do with Gym."""
        MicroRTSGridModeSharedMemVecEnv.reset(self)
        np.copyto(self.actions, self._reset_actions)
        all_agents = self.agents
        self._agent_selector.reinit(all_agents)
        self.agent_selection = self._agent_selector.next()
        self.infos = {agent: {} for agent in all_agents}
        self.dones = {agent: False for agent in all_agents}
        self._cumulative_rewards = {agent: 0.0 for agent in all_agents}

    def step(self, action):
        agent = self.agent_selection
        # fill in action for a given agent
        np.copyto(self.actions[0][agent], action)
        if self._agent_selector.is_last():
            all_agents = self.agents
            obs, rewards, dones, infos = self.step_wait()
            self.infos = {agent: infos[0].copy() for agent in all_agents}
            self.dones = {agent: dones[0].copy() for agent in all_agents}
            self._cumulative_rewards = {agent: rewards[0] / len(all_agents) for agent in all_agents}
            # reset actions now, as we already used them in the environment
            np.copyto(self.actions, self._reset_actions)
            # the set of live units may have changed after the env step
            self._agent_selector.reinit(self.agents)
        # advance to the next agent
        self.agent_selection = self._agent_selector.next()

    def observe(self, agent):
        return self.obs[0][agent]

    @property
    def max_num_agents(self):
        return self.height * self.width

    @property
    def game_state(self):
        return self.vec_client[0].gs

    @property
    def agents(self):
        return [u.getPosition() for u in self.game_state.getUnits()]
```
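For anyone unfamiliar with the AEC model, the driver loop for an environment like this follows PettingZoo's standard pattern: query the currently selected agent, observe, act, repeat until all agents are done. A minimal self-contained sketch of that loop, where `TinyAEC` is a hypothetical toy stand-in for `MicroRTSAEC` (the real env needs the microrts runtime):

```python
class TinyAEC:
    """Toy stand-in for an AEC env: two agents act in turn for a few rounds."""

    def __init__(self, num_rounds=3):
        self.num_rounds = num_rounds

    def reset(self):
        self.agents = ["agent_0", "agent_1"]
        self._round = 0
        self._idx = 0
        self.agent_selection = self.agents[self._idx]
        self.dones = {a: False for a in self.agents}
        self._cumulative_rewards = {a: 0.0 for a in self.agents}

    def observe(self, agent):
        return self._round  # trivial observation

    def step(self, action):
        agent = self.agent_selection
        self._cumulative_rewards[agent] += action
        # advance round-robin; a full pass over all agents = one env step
        self._idx = (self._idx + 1) % len(self.agents)
        if self._idx == 0:
            self._round += 1
            if self._round >= self.num_rounds:
                self.dones = {a: True for a in self.agents}
        self.agent_selection = self.agents[self._idx]


env = TinyAEC()
env.reset()
while not all(env.dones.values()):
    agent = env.agent_selection
    obs = env.observe(agent)
    env.step(1)  # a real policy would map obs -> action here
print(env._cumulative_rewards)  # each agent acted once per round
```

The point is only the shape of the loop: the driver never chooses which agent acts, it always defers to `agent_selection`, which is exactly what the `agent_selector` in the snippet above manages.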
Thanks for sharing, this definitely gives me a nice place to start.
@kachayev thanks for sharing this!
> I think that the use case of having API for 2 players would be much easier
I agree. My first thought on this is that gym-microrts's PettingZoo API should be very similar to chess's PettingZoo API, which only has two players: https://www.pettingzoo.ml/classic/chess
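To make the comparison concrete, the chess-style two-player version reduces to buffering the first player's action and stepping the underlying env once both players have acted. A rough self-contained sketch, where `TwoPlayerAEC` and the `step_both` callback are hypothetical scaffolding, not gym-microrts API:

```python
class TwoPlayerAEC:
    """Sketch of a two-player AEC wrapper: buffer player_0's action,
    step the underlying env once player_1 has also acted."""

    def __init__(self, step_both):
        # step_both(actions) -> (observations, rewards), one entry per player
        self._step_both = step_both
        self.agents = ["player_0", "player_1"]

    def reset(self):
        self._pending = {}
        self.agent_selection = "player_0"
        self.rewards = {a: 0.0 for a in self.agents}
        self.observations = {a: None for a in self.agents}

    def step(self, action):
        self._pending[self.agent_selection] = action
        if self.agent_selection == "player_0":
            # first player's action is buffered; hand the turn over
            self.agent_selection = "player_1"
        else:
            # both actions collected: one underlying env step
            acts = [self._pending["player_0"], self._pending["player_1"]]
            obs, rew = self._step_both(acts)
            self.observations = dict(zip(self.agents, obs))
            self.rewards = dict(zip(self.agents, rew))
            self._pending = {}
            self.agent_selection = "player_0"


# usage with a dummy zero-sum underlying step function
def dummy_step_both(actions):
    diff = actions[0] - actions[1]
    return [0, 0], [diff, -diff]

env = TwoPlayerAEC(dummy_step_both)
env.reset()
env.step(5)   # player_0 acts; action is buffered
env.step(2)   # player_1 acts; underlying env steps once
print(env.rewards)  # {'player_0': 3, 'player_1': -3}
```

With a static two-agent roster there is no need to `reinit` the selector after every step, which is the main simplification over the per-unit version.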
@BolunDai0216 Absolutely!
@vwxyzjn if my memory doesn't fail me, chess is also implemented as an AEC env, so the API would look the same. I meant that the implementation would be easier with a static number of agents.
TLDR: PettingZoo has become the standard library for multi-agent environments, and we want to support PettingZoo bindings in gym-microrts.
This project https://github.com/vwxyzjn/gym-microrts is an RL environment for an RTS game, where lots of units are constantly spawning and dying. Because of the multi-agent nature of RTS games, gym-microrts should fit PettingZoo's interface pretty seamlessly.
We currently need help on the following fronts:
- Setting up an issue to track progress.
@BolunDai0216 suggests he would like to take a stab at this.