We need to recalculate `action_prob` to get its gradient for the update of the `rnn`, the policy network.
The saved weights are just scalars, not tensors, so they carry no gradient, and the policy network cannot be updated without one.
You could try saving the weights as tensors instead, but I'm not sure that would work.
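
For illustration, here is a minimal sketch (not the repo's actual code; the network, optimizer, and tensor shapes are simplified assumptions) of why the probabilities are recomputed inside the training step: the values stored during rollout are detached Python floats, while the loss needs tensors that are still connected to the policy network's computation graph.

```python
import torch
import torch.nn as nn

# Toy policy network standing in for the repo's rnn (simplified assumption).
policy = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 3))
optimizer = torch.optim.Adam(policy.parameters(), lr=1e-3)

obs = torch.randn(1, 4)        # one observation collected during rollout
action = torch.tensor([1])     # the action that was chosen
ret = torch.tensor(5.0)        # the (discounted) return for that step

# --- What would get stored in the episode during rollout ---
with torch.no_grad():
    probs_rollout = torch.softmax(policy(obs), dim=-1)
stored_prob = probs_rollout[0, action].item()   # a plain float: no graph, no gradient

# --- Training: recompute the probability so it is attached to the graph ---
probs = torch.softmax(policy(obs), dim=-1)       # fresh forward pass through the network
log_prob = torch.log(probs.gather(1, action.unsqueeze(1)))
loss = -(log_prob * ret).mean()                  # REINFORCE-style policy-gradient loss

optimizer.zero_grad()
loss.backward()                                  # gradients flow into the policy parameters
optimizer.step()

# Using stored_prob here instead would not work: it is detached from the network's
# parameters, so there would be no gradient to propagate back into the policy.
```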
Thank you for your work. It's super helpful for beginners like me. I just have a question about getting the action weights.

When we generate episodes with the `rolloutWorker`, we already have the action weights before we choose the actions (https://github.com/starry-sky6688/StarCraft/blob/2c07045f294ad4eeb5ab8a8d25cf43d0efea4cb3/common/rollout.py#L180), but when we calculate the loss in the `agent` during training, we calculate those action weights again (https://github.com/starry-sky6688/StarCraft/blob/2c07045f294ad4eeb5ab8a8d25cf43d0efea4cb3/policy/reinforce.py#L79). Is there any reason why we do this instead of just putting those weights into the `episode`?

Thank you very much for your time.