jbr-ai-labs / mamba

This code accompanies the paper "Scalable Multi-Agent Model-Based Reinforcement Learning".
MIT License

Did you remove the transformer and run the experiments? #4

Closed GoingMyWay closed 2 years ago

GoingMyWay commented 2 years ago

Hi, the transformer is used in your code. Without the transformer, can your method still achieve good performance?

vladimirrim commented 2 years ago

Hi, we didn’t conduct an ablation without the transformer, as it is a vital part of the architecture, but it would be interesting to check the performance in that setup. I hypothesize that performance would drop significantly, since the original Dreamer algorithm performs poorly in MARL environments.

GoingMyWay commented 2 years ago

Hi @vladimirrim, thanks for the clarification. I am also curious: during the trajectory-generation procedure, is the transformer also used in "imagination"?

vladimirrim commented 2 years ago

Yes, it is important for both the training and execution phases.
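To make the question concrete for other readers: "using the transformer in imagination" can be pictured as agents' latent states attending to each other at every imagined step, before each latent is advanced by the dynamics model. The sketch below is not the MAMBA code; it is a minimal NumPy illustration with hypothetical names (`attention_over_agents`, `imagine_rollout`) and an arbitrary stand-in dynamics function.

```python
import numpy as np

def attention_over_agents(latents, Wq, Wk, Wv):
    """Scaled dot-product attention across agents' latent states.

    latents: (n_agents, d) array of per-agent latents.
    Returns an (n_agents, d) array where each agent's latent is a
    softmax-weighted mix of all agents' value projections.
    """
    q, k, v = latents @ Wq, latents @ Wk, latents @ Wv
    scores = q @ k.T / np.sqrt(k.shape[-1])         # (n_agents, n_agents)
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over agents
    return weights @ v

def imagine_rollout(latents, steps, params, dynamics):
    """Roll latent states forward without touching the environment.

    At every imagined step the agents first exchange information via
    attention, then each latent is advanced by the (hypothetical)
    per-agent dynamics function.
    """
    trajectory = [latents]
    for _ in range(steps):
        mixed = attention_over_agents(latents, *params)
        latents = dynamics(mixed)
        trajectory.append(latents)
    return trajectory

# Toy usage with random projection matrices and tanh as a stand-in dynamics
rng = np.random.default_rng(0)
d, n_agents = 4, 3
params = [rng.standard_normal((d, d)) for _ in range(3)]
init = rng.standard_normal((n_agents, d))
traj = imagine_rollout(init, steps=5, params=params, dynamics=np.tanh)
```

Since the attention step runs inside every imagined transition, the communication structure it provides is indeed needed at both training (imagined rollouts) and execution time, as noted above.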

GoingMyWay commented 2 years ago

> Yes, it is important for both the training and execution phases.

Dear @vladimirrim, I see. Did any reviewers raise concerns about using the transformer in "imagination"? In MARL, it is challenging to train a world model without using global information, since agents only have partial observations.

GoingMyWay commented 2 years ago

Dear @vladimirrim, I am curious whether you tried other model-based methods for MARL. I found that, without exploiting global information, the most useful approach is to mix all agents' local information to guide the training of each agent's local world model, as in CTDE, the mixing network of QMIX, and other Dec-POMDP-based MARL methods. Incorporating a world model into MARL is non-trivial.
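For concreteness, here is a minimal NumPy sketch of the QMIX-style monotonic mixing I mean: per-agent Q-values are combined into a joint value through weights generated from the global state by a hypernetwork, with absolute values keeping the mixing monotonic. Shapes and the single-layer hypernetworks are illustrative, not QMIX's exact implementation.

```python
import numpy as np

def qmix_mixing(agent_qs, state, W1, b1, W2, b2):
    """QMIX-style monotonic mixing of per-agent Q-values.

    agent_qs: (n_agents,) chosen-action Q-value per agent.
    state:    (state_dim,) global state, available only during
              centralized training (the CTDE setting).
    W1, b1, W2, b2: hypernetwork weights mapping the state to the
    mixing network's parameters. Taking |.| of the generated weights
    keeps Q_tot monotonic in every agent's Q-value, so argmax over
    joint actions decomposes into per-agent argmaxes.
    """
    n, h = agent_qs.shape[0], b1.shape[1]
    w1 = np.abs(state @ W1).reshape(n, h)               # (n_agents, hidden)
    w2 = np.abs(state @ W2).reshape(h, 1)               # (hidden, 1)
    hidden = np.maximum(agent_qs @ w1 + state @ b1, 0)  # ReLU here for brevity
    q_tot = hidden @ w2 + state @ b2
    return q_tot.item()

# Toy usage: 3 agents, 5-dim global state, hidden width 4
rng = np.random.default_rng(1)
n, s, h = 3, 5, 4
W1 = rng.standard_normal((s, n * h)); b1 = rng.standard_normal((s, h))
W2 = rng.standard_normal((s, h));     b2 = rng.standard_normal((s, 1))
qs, st = rng.standard_normal(n), rng.standard_normal(s)
q_tot = qmix_mixing(qs, st, W1, b1, W2, b2)
```

Because the generated weights are non-negative, raising any single agent's Q-value can never decrease the mixed value, which is the property that makes decentralized greedy action selection consistent with the centralized objective.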