shariqiqbal2810 / MAAC

Code for "Actor-Attention-Critic for Multi-Agent Reinforcement Learning" ICML 2019
MIT License

2 agents modification #8

Closed by emanuelepesce 5 years ago

emanuelepesce commented 5 years ago

I noticed that the code crashes when 2 agents are used, because of a dimension problem in the sum call in critics.py, line 138.

I managed to sort it out in this way:

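for i, a_i in enumerate(agents):
    if max(agents) == 1:
        # 2 agents (indices 0 and 1): sum over dim 0, because dim 1
        # is no longer available after squeeze()
        head_entropies = [(-((probs + 1e-8).log() * probs).squeeze().sum(0)
                           .mean()) for probs in all_attend_probs[i]]
    else:
        # 3 or more agents: original behaviour, sum over dim 1
        head_entropies = [(-((probs + 1e-8).log() * probs).squeeze().sum(1)
                           .mean()) for probs in all_attend_probs[i]]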

Does that sound good to you?
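
A possible way to avoid the special case is to sum over the last axis directly instead of calling squeeze() first. The sketch below is only an illustration: it assumes the attention weights have shape (batch, 1, n_others), which is not stated in this thread, and it uses NumPy as a stand-in for PyTorch because squeeze() with no arguments drops every size-1 axis in both libraries. The name attention_entropy is hypothetical and not part of the repository.

```python
import numpy as np


def attention_entropy(probs):
    """Mean entropy of attention weights with assumed shape (batch, 1, n_others)."""
    # Sum over the last axis (the attended agents) rather than squeezing first,
    # so the result does not depend on which axes happen to have size 1.
    per_sample = -(np.log(probs + 1e-8) * probs).sum(axis=-1)
    return per_sample.mean()


# Two agents: each agent attends over exactly one other agent.
two_agent_probs = np.ones((4, 1, 1))
# Three agents: uniform attention over the two other agents.
three_agent_probs = np.full((4, 1, 2), 0.5)

# With two agents, squeeze() collapses both size-1 axes, so no axis 1 is left.
squeezed_shape = two_agent_probs.squeeze().shape

h2 = attention_entropy(two_agent_probs)
h3 = attention_entropy(three_agent_probs)
```

In PyTorch the corresponding call would be probs.sum(dim=-1) in place of .squeeze().sum(1); whether that matches the rest of critics.py has not been checked here.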

shariqiqbal2810 commented 5 years ago

Thanks for the heads up! Please feel free to make a pull request. We never tested with 2 agents, since the attention mechanism isn't making any choices in that case: it operates over all the other agents, and with 2 agents there is only 1 other agent.
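
The point about attention making no choices with 2 agents can be checked with a small example, assuming the attention weights come from a softmax over the other agents' logits (an assumption, since the thread does not show that code). A softmax over a single logit is always 1, so the entropy of the weights is 0 whatever the logit value. The helpers softmax and entropy below are illustrative and written with the standard library only.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a plain list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def entropy(probs):
    """Shannon entropy in nats, skipping zero-probability entries."""
    return -sum(p * math.log(p) for p in probs if p > 0)


# One other agent (the 2-agent case): the weight is 1 regardless of the logit.
single = softmax([3.7])
# Two other agents with equal logits (a 3-agent case): the weights are split evenly.
pair = softmax([3.7, 3.7])

single_entropy = entropy(single)
pair_entropy = entropy(pair)
```

Here single_entropy is zero and pair_entropy equals log(2), so the entropy term only carries information once there are at least two other agents to attend over.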