Closed. GoingMyWay closed this issue 4 years ago.
Dear chaovven, in https://github.com/chaovven/SMIX/blob/master/src/controllers/basic_controller.py#L75 I found that only one controller is defined. For homogeneous envs I think that is fine, since the agents are homogeneous and can all learn one `agent`. However, for heterogeneous envs such as `3s5z_vs_3s6z`, I don't think it is OK, since there are `s`s and `z`s in each group, right? So I think a new `agent` should be defined for each agent type.
Hi, there is only one agent network, shared by all agents, that performs action selection. It's trivial for the homogeneous case, as you said. For heterogeneous scenarios, this works by assuming all agents have the same observation size and taking the maximum number of available actions over all agents as the extended action space for the shared network. Using the `available_actions` returned by the environment, we can then mask out the actions that are unavailable to certain types of agents.
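To make the masking idea concrete, here is a minimal sketch in plain Python. It is not the code from the repository: the function name `masked_greedy_actions` and the toy Q-values are hypothetical, and it only assumes that the environment reports a 0/1 availability vector per agent over the shared, max-sized action space.

```python
import math


def masked_greedy_actions(q_values, avail_actions):
    """Pick the greedy action per agent, ignoring unavailable actions.

    q_values:      list of per-agent lists, one Q-value per slot of the
                   shared (max-sized) action space.
    avail_actions: same shape, 1 where the action is available, 0 otherwise.
    """
    chosen = []
    for agent_q, agent_avail in zip(q_values, avail_actions):
        # Unavailable slots get -inf so they can never win the argmax.
        masked = [q if a else -math.inf for q, a in zip(agent_q, agent_avail)]
        chosen.append(max(range(len(masked)), key=masked.__getitem__))
    return chosen


# Hypothetical example: two agents share a 4-slot action space, but the
# second agent's type has no fourth action, so that slot is masked out.
q_values = [[0.1, 0.5, 0.2, 0.9],
            [0.3, 0.2, 0.4, 2.0]]
avail_actions = [[1, 1, 1, 1],
                 [1, 1, 1, 0]]
actions = masked_greedy_actions(q_values, avail_actions)
```

Even though the shared network outputs a high value for the second agent's fourth slot, the mask forces the choice onto one of the actions that agent can actually take.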
Yeah, that is true.
Hi, sir, did you try the micro-trick scenarios, for example `bane_vs_bane`? Based on your experience, can pymarl and your code train on micro-trick scenarios without any code changes?