Unity-Technologies / ml-agents

The Unity Machine Learning Agents Toolkit (ML-Agents) is an open-source project that enables games and simulations to serve as environments for training intelligent agents using deep reinforcement learning and imitation learning.
https://unity.com/products/machine-learning-agents
Other
17.18k stars 4.16k forks source link

Continuous actions in PettingZoo API #6135

Closed AmineAndam04 closed 1 month ago

AmineAndam04 commented 2 months ago

Describe the bug in the the _update_action_spaces method of the UnityPettingzooBaseEnv class, you force the actions to be of int32 and in the [-1,1] range (see the code below). This constraint doesn't apply to all scenarios

act_spec.continuous_size > 0:
                    c_space = spaces.Box(
                        -1, 1, (act_spec.continuous_size,), dtype=np.int32
                    )

It makes more sense to use the same logic as in _update_observation_spaces.

obs_spaces = tuple(
                    spaces.Box(
                        low=-np.float32(np.inf),
                        high=np.float32(np.inf),
                        shape=spec.shape,
                        dtype=np.float32,
                    )

I'm aware that this issue was raised previously, but it is still present.

Console logs / stack traces The action to send is

'RegularUser?team=0?agent_id=0': array([ 0.        ,  1.        ,  0.        ,  1.25601637, -0.18061967,
         0.54454458, -0.43266517, -0.34340054, -0.40797555,  1.97381461,
        -0.90531713, -1.37672544,  0.24041645,  1.01703703, -1.38512528,
        -1.03809536,  0.89815623,  1.84892368,  0.07173234,  0.70097125,
        -0.12584344, -1.17439771,  0.59523576, -1.40064371,  0.12704335,
         0.83167762, -1.33476043, -0.19449671,  1.01432335, -0.28750473,
         1.29402018,  0.00865571, -0.25417686,  0.13685977, -0.44362086,
        -1.86416674, -0.71164739,  2.07690144, -0.7751472 ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ,  0.        ,
         0.        ,  0.        ,  0.        ,  0.        ]),

But I get this

  0  0 -1  0  1  0  1  0  0  0  0 -1  0  2  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0] but was expecting action from <bound method UnityPettingzooBaseEnv.action_space of <mlagents_envs.envs.unity_parallel_env.UnityParallelEnv object at 0x1078db010>>

Environment :

github-actions[bot] commented 1 month ago

This issue is stale because it has been open for 30 days with no activity.

github-actions[bot] commented 1 month ago

This issue was closed because it has been inactive for 14 days since being marked as stale. Please open a new issue for related bugs.