Closed KaleabTessera closed 2 years ago
Thanks for doing this. Once the PZ tests are updated to check that this correctly implements the PZ API (in progress here), this will be a change I would support merging.
Thanks @KaleabTessera for your contribution. @rodrigodelazcano could you please review this?
Thanks @KaleabTessera for making the PR. However, I would prefer to wait until this issue https://github.com/PettingZoo-Team/PettingZoo/issues/464 is resolved and @benblack769 approves as well.
Okay, just let me know if you want me to change anything @rodrigodelazcano @benblack769 :+1:
Sure thing! Thank you for your interest @KaleabTessera
@rodrigodelazcano This issue https://github.com/PettingZoo-Team/PettingZoo/issues/464 seems to be closed. Can you review this PR now?
@benblack769 Just a friendly ping on this. :+1:
Ah, yes, thanks for the ping, I lost track of this. The intended change was made and released in pettingzoo 1.12.0
The agents list should now represent the agents that should take the next step in the environment.
I think the next step is to check that the API test for the SMAC pettingzoo wrapper passes, and if so, then this PR should be good.
@benblack769 @rodrigodelazcano Can you please confirm that these tests work on the current version of smac?
Using PZ 1.12.0 (I also tried 1.11.0 and 1.11.1) and the master version of smac, running these tests (`pytest smac/env/pettingzoo/test/ -x`) fails. Since it also fails with v1.11.1 of PZ, I don't think it is related to the new changes. Here is one of the errors:
if isinstance(env.observation_space(agent), gym.spaces.Box):
assert env.observation_space(agent).dtype == prev_observe.dtype
> assert env.observation_space(agent).contains(prev_observe), \
("Out of bounds observation: " + str(prev_observe))
E AssertionError: Out of bounds observation: {'observation': array([1. , 1. , 1. , 1. , 0. ,
E 0. , 0. , 0. , 0. , 0. ,
E 0. , 0. , 0. , 0. , 0. ,
E 0. , 0. , 0. , 0. , 1. ,
E 0.0764974 , 0. , 0.0764974 , 1. , 1. ,
E 0.09385597, 0.07411024, 0.05759006, 1. , 1. ],
E dtype=float32), 'action_mask': array([1, 1, 1, 1, 1, 0, 0, 0])}
E assert False
E + where False = <bound method Dict.contains of Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box.... -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32))>({'action_mask': array([1, 1, 1, 1, 1, 0, 0, 0]), 'observation': array([1. , 1. , 1. , 1. ,...974 , 1. , 1. ,\n 0.09385597, 0.07411024, 0.05759006, 1. , 1. ],\n dtype=float32)})
E + where <bound method Dict.contains of Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box.... -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32))> = Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box([-1. -1. -1. -1. -1. -1. -1. -...1. -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32)).contains
E + where Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box([-1. -1. -1. -1. -1. -1. -1. -...1. -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32)) = <bound method BaseWrapper.observation_space of <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>>('marine_0')
E + where <bound method BaseWrapper.observation_space of <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>> = <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>.observation_space
../../../anaconda3/envs/smac/lib/python3.8/site-packages/pettingzoo/test/api_test.py:183: AssertionError
I have also attached my requirements file. requirements.txt
Ah, there were a few updates to gym which made observation space checking much more strict. I can take a look.
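One possible reading of the traceback above, offered as an assumption and not confirmed in this thread: the `action_mask` array is printed without a dtype, which usually means NumPy's default integer type, while the space declares `int8`. A strict membership check that refuses unsafe casts would then reject it even though every value is in bounds. The sketch below uses only NumPy; `strict_contains` is a hypothetical stand-in for such a check, not gym's actual code.

```python
import numpy as np


def strict_contains(x, low, high, dtype):
    """Hypothetical strict check: dtype must be safely castable AND values in bounds."""
    return bool(
        np.can_cast(x.dtype, dtype)  # int64 -> int8 is not a safe cast
        and np.all(x >= low)
        and np.all(x <= high)
    )


# Mask values copied from the traceback; int64 dtype is an assumption.
mask = np.array([1, 1, 1, 1, 1, 0, 0, 0], dtype=np.int64)

rejected = strict_contains(mask, 0, 1, np.int8)                   # False
accepted = strict_contains(mask.astype(np.int8), 0, 1, np.int8)   # True
```

If this reading is right, casting the mask to the declared dtype before returning it from the wrapper would be enough for that part of the check to pass.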
@KaleabTessera I went ahead and made the wrapper comply with the pettingzoo API tests here: #84
Thanks for your work on this, your insights here really pushed a full API change to the whole pettingzoo ecosystem.
PettingZoo uses `self.agents` to keep track of the active agents in the environment. In the Smac wrapper, `self.agents` is updated before `self.env.step`, meaning that `self.agents` is always a step behind. This results in `all_observes` including information from dead agents, and it makes it difficult or impossible to know when the env is done when using the PZ wrapper (the usual env_done check is seeing if `self.agents` is empty).

Full discussion: https://github.com/instadeepai/Mava/issues/297. @rodrigodelazcano @benblack769
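The ordering problem described above can be illustrated with a toy sketch. Everything here is hypothetical (`ToyBattle`, `StaleWrapper`, `FreshWrapper`, and the scripted death steps are made up for illustration and are not the SMAC wrapper's code); it only shows why refreshing the agent list before stepping leaves it one step behind.

```python
class ToyBattle:
    """Hypothetical underlying env: each agent dies at a scripted step."""

    def __init__(self, death_step):
        self.death_step = dict(death_step)
        self.t = 0

    def step(self):
        self.t += 1

    def alive(self):
        return [a for a, d in self.death_step.items() if self.t < d]


class StaleWrapper:
    """Refreshes self.agents BEFORE stepping, so the list lags by one step."""

    def __init__(self, env):
        self.env = env
        self.agents = env.alive()

    def step(self):
        self.agents = self.env.alive()  # computed from the pre-step state
        self.env.step()


class FreshWrapper:
    """Refreshes self.agents AFTER stepping, so the list matches the env."""

    def __init__(self, env):
        self.env = env
        self.agents = env.alive()

    def step(self):
        self.env.step()
        self.agents = self.env.alive()


def steps_until_done(wrapper, max_steps=10):
    """Usual done check: loop until the agent list is empty."""
    steps = 0
    while wrapper.agents and steps < max_steps:
        wrapper.step()
        steps += 1
    return steps


deaths = {"marine_0": 1, "marine_1": 2}
fresh_steps = steps_until_done(FreshWrapper(ToyBattle(deaths)))  # 2
stale_steps = steps_until_done(StaleWrapper(ToyBattle(deaths)))  # 3
```

In this toy the stale wrapper takes one extra step before the empty-list check fires, and in between it still lists an agent that the underlying env already considers dead.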