PKU-MARL / Multi-Agent-Transformer


Questions about the training and execution paradigm? #13

Closed BigBearBlacken closed 1 year ago

BigBearBlacken commented 1 year ago

As shown in the architecture of MAT, the encoder and decoder seem to need the observations and actions of all agents during training. But in the execution phase, an agent can't obtain the other agents' observations and actions. So how does each agent recursively choose its action based on the actions of the other agents at execution time? Or does the algorithm adopt a centralized training and centralized execution paradigm?

morning9393 commented 1 year ago

Hiya,

Thanks for your interest. The vanilla MAT with recursion does indeed adopt a centralized training and centralized execution paradigm, since it concentrates more on modeling the interrelationship between agents. For decentralized scenarios, you may choose the MAT-dec version, which captures the interrelationship with an encoder and makes decisions with decentralized MLPs.
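To make the difference concrete, here is a minimal toy sketch of the two execution styles. It is not the repository's code: `toy_score`, `centralized_autoregressive_act`, and `decentralized_act` are hypothetical names, and a simple weighted sum stands in for the learned networks. It only illustrates which inputs each agent's decision is allowed to depend on.

```python
N_ACTIONS = 4  # assumed size of a discrete action space, for illustration only


def toy_score(features):
    """Hypothetical stand-in for a learned policy head: maps features to an action index."""
    return sum((i + 1) * f for i, f in enumerate(features)) % N_ACTIONS


def centralized_autoregressive_act(all_obs):
    """Centralized execution: one controller sees every observation and picks
    actions one agent at a time, feeding earlier actions to later agents."""
    joint = [x for obs in all_obs for x in obs]  # all agents' observations are visible
    actions = []
    for _ in all_obs:
        # Each decision is conditioned on the joint observation and on the
        # actions already chosen for the preceding agents.
        actions.append(toy_score(joint + actions))
    return actions


def decentralized_act(all_obs):
    """Decentralized execution: each agent's head uses only that agent's own observation."""
    return [toy_score(obs) for obs in all_obs]


if __name__ == "__main__":
    observations = [[1, 2], [3, 4], [5, 6]]  # made-up integer observations for 3 agents
    print(centralized_autoregressive_act(observations))
    print(decentralized_act(observations))
```

In the decentralized variant, changing one agent's observation cannot change another agent's action, which is what makes it usable when agents cannot share observations or actions at execution time.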

Hoping it is helpful, Muning

BigBearBlacken commented 1 year ago

Got it! Thanks for your reply!