Fix PZ wrapper active agents calculation

KaleabTessera commented 2 years ago

PettingZoo uses self.agents to keep track of the active agents in the environment. In the Smac wrapper, self.agents is updated before self.env.step, meaning that self.agents is always a step behind. This results in all_observes including information from dead agents and it makes it difficult/impossible to know when the env is done when using the PZ wrapper (the usual env_done check is seeing if self.agents is empty).

Full discussion - https://github.com/instadeepai/Mava/issues/297. @rodrigodelazcano @benblack769

benblack769 commented 2 years ago

Thanks for doing this. Once the PZ tests are updated to check that this correctly implements the PZ API (in progress here, this will be a change I would support merging.

samvelyan commented 2 years ago

Thanks @KaleabTessera for your contribution. @rodrigodelazcano could you please review this?

rodrigodelazcano commented 2 years ago

Thanks @KaleabTessera for making the PR. However, I would prefer to wait until this issue https://github.com/PettingZoo-Team/PettingZoo/issues/464 is resolved and @benblack769 approves as well.

KaleabTessera commented 2 years ago

Thanks @KaleabTessera for making the PR. However, I would prefer to wait until this issue PettingZoo-Team/PettingZoo#464 is resolved and @benblack769 approves as well.

Okay, just let me know if you want me to change anything @rodrigodelazcano @benblack769 :+1:

rodrigodelazcano commented 2 years ago

Sure thing! Thank you for your interest @KaleabTessera

KaleabTessera commented 2 years ago

@rodrigodelazcano This issue https://github.com/PettingZoo-Team/PettingZoo/issues/464 seems to be closed. Can you review this PR now?

KaleabTessera commented 2 years ago

@benblack769 Just a friendly ping on this. :+1:

benblack769 commented 2 years ago

Ah, yes, thanks for the ping, I lost track of this. The intended change was made and released in pettingzoo 1.12.0

The agents list should now represent the agents that should take the next step in the environment.

I think the next step is to check that the API test for the SMAC pettingzoo wrapper passes, and if so, then this PR should be good.

KaleabTessera commented 2 years ago

@benblack769 @rodrigodelazcano Can you please confirm that these tests work on current version of smac?

Using PZ 1.12.0 (I also tried 1.11.0 and 1.11.1) and the master version of smac, running these tests (pytest smac/env/pettingzoo/test/ -x) fail. Since it appears to also fail in v1.11.1 of PZ, I don't think it is related to the new changes. Here is one of the errors:

            if isinstance(env.observation_space(agent), gym.spaces.Box):
                assert env.observation_space(agent).dtype == prev_observe.dtype
>           assert env.observation_space(agent).contains(prev_observe), \
                ("Out of bounds observation: " + str(prev_observe))
E           AssertionError: Out of bounds observation: {'observation': array([1.        , 1.        , 1.        , 1.        , 0.        ,
E                    0.        , 0.        , 0.        , 0.        , 0.        ,
E                    0.        , 0.        , 0.        , 0.        , 0.        ,
E                    0.        , 0.        , 0.        , 0.        , 1.        ,
E                    0.0764974 , 0.        , 0.0764974 , 1.        , 1.        ,
E                    0.09385597, 0.07411024, 0.05759006, 1.        , 1.        ],
E                   dtype=float32), 'action_mask': array([1, 1, 1, 1, 1, 0, 0, 0])}
E           assert False
E            +  where False = <bound method Dict.contains of Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box.... -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32))>({'action_mask': array([1, 1, 1, 1, 1, 0, 0, 0]), 'observation': array([1.        , 1.        , 1.        , 1.        ,...974 , 1.        , 1.        ,\n       0.09385597, 0.07411024, 0.05759006, 1.        , 1.        ],\n      dtype=float32)})
E            +    where <bound method Dict.contains of Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box.... -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32))> = Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box([-1. -1. -1. -1. -1. -1. -1. -...1. -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32)).contains
E            +      where Dict(action_mask:Box([0 0 0 0 0 0 0 0], [1 1 1 1 1 1 1 1], (8,), int8), observation:Box([-1. -1. -1. -1. -1. -1. -1. -...1. -1.], [1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1. 1.\n 1. 1. 1. 1. 1. 1.], (30,), float32)) = <bound method BaseWrapper.observation_space of <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>>('marine_0')
E            +        where <bound method BaseWrapper.observation_space of <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>> = <pettingzoo.utils.wrappers.order_enforcing.OrderEnforcingWrapper object at 0x7fe9e779ce80>.observation_space

../../../anaconda3/envs/smac/lib/python3.8/site-packages/pettingzoo/test/api_test.py:183: AssertionError

I have also attached my requirements file. requirements.txt

benblack769 commented 2 years ago

Ah, there were a few updates to gym which made observation space checking much more strict. I can take a look.

benblack769 commented 2 years ago

@KaleabTessera I went ahead and made the wrapper comply with the pettingzoo API tests here: #84

Thanks for your work on this, your insights here really pushed a full API change to the whole pettingzoo ecosystem.

oxwhirl / smac

Fix PZ wrapper active agents calculation #81