FLAIROx / JaxMARL

Multi-Agent Reinforcement Learning with JAX
Apache License 2.0

Unable to apply Q-learning baselines on envs with non-homogeneous agents #70

Closed: zez2001 closed 5 months ago

zez2001 commented 5 months ago

Your Q-learning baselines assume all agents have the same obs and action dims; however, in MPE_Single_Tag there are two different obs sizes. How can I apply qmix.py to MPE_Single_Tag?

mttga commented 5 months ago

Hi @zez2001,

The Q-learning algorithms do support environments with different observation/action spaces. These are unified by the CTRolloutManager.
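For reference, here is a small sketch of how you can inspect the heterogeneous per-agent spaces and wrap the environment. The import path of CTRolloutManager, its batch_size keyword, and the batch_reset method name are my assumptions from the baselines code, so double-check them against baselines/QLearning:

```python
import jax
from jaxmarl import make
# assumed import path; check the Q-learning baselines for the actual location
from jaxmarl.wrappers.baselines import CTRolloutManager

env = make("MPE_simple_tag_v3")

# the raw env exposes heterogeneous spaces: adversaries and the good agent
# observe vectors of different lengths
for agent in env.agents:
    print(agent, env.observation_space(agent).shape, env.action_space(agent).n)

# the rollout manager unifies these so the Q-networks see one obs size
rollout_mgr = CTRolloutManager(env, batch_size=8)  # batch_size assumed kwarg
rng = jax.random.PRNGKey(0)
obs, state = rollout_mgr.batch_reset(rng)  # method name assumed from the baselines
```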

You can easily run the Q-learning algorithms on MPE_simple_tag_v3 (I assume you're referring to this scenario?) with an environment configuration like this (let's say it is baselines/QLearning/config/env/mpe_simple_tag.yaml):

"ENV_NAME": "MPE_simple_tag_v3"
"ENV_KWARGS": {}

and by running:

python baselines/QLearning/qmix.py +alg=qmix_mpe +env=mpe_simple_tag

However, I'm not sure about running QMIX on this environment. Simple Tag is a cooperative-competitive environment, and QMIX is designed only for cooperative tasks. I would focus on IQL instead.
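For example, assuming an IQL algorithm config exists alongside the QMIX one (check baselines/QLearning/config/alg for the actual name; iql_mpe here is my guess by analogy), the analogous command would be:

python baselines/QLearning/iql.py +alg=iql_mpe +env=mpe_simple_tag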

Let me know if this helps.

zez2001 commented 5 months ago

Thank you for your answer,

I checked your CTRolloutManager, and it seems that you pad the observation vectors to the maximum length. I wonder whether I can use this on MPE_simple_tag_v3, since the agents have different obs sizes. If I just pad the vectors with zeros, might I get inappropriate parameters when I feed them into the Q-network?

mttga commented 5 months ago

This is standard practice in MARL. Closing this since I don't see any codebase issue.
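For intuition, here is a minimal sketch of the zero-padding idea (not the actual CTRolloutManager code): observations are padded to the largest size, and since the padded entries are constant zeros they contribute nothing to the Q-network's output and receive no gradient, so the network effectively learns to ignore them.

```python
import jax.numpy as jnp

def pad_obs(obs_dict, max_obs_size):
    """Zero-pad each agent's observation to a common length.

    Padded entries are constant zeros, so the first-layer weights attached
    to those inputs get no gradient signal from them; the shared network
    simply ignores the padding.
    """
    return {
        agent: jnp.concatenate([obs, jnp.zeros(max_obs_size - obs.shape[-1])])
        for agent, obs in obs_dict.items()
    }

# toy example: two agents with different obs sizes (e.g. adversary vs. good agent)
obs = {"adversary_0": jnp.ones(16), "agent_0": jnp.ones(14)}
max_size = max(o.shape[-1] for o in obs.values())
padded = pad_obs(obs, max_size)
print({a: o.shape for a, o in padded.items()})  # both now (16,)
```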