Question about loss computation

starry-sky6688 / MARL-Algorithms

Implementations of IQL, QMIX, VDN, COMA, QTRAN, MAVEN, CommNet, DyMA-CL, and G2ANet on SMAC, the decentralised micromanagement scenario of StarCraft II

1.46k stars, 283 forks

Question about loss computation #75

Closed Duke-Allen closed 2 years ago

Duke-Allen commented 2 years ago

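Hello, I have a few more questions. I noticed that in algorithms such as COMA and REINFORCE, the loss is multiplied by mask and then divided by mask.sum. Why is that? I don't quite understand it. Could you explain?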

starry-sky6688 commented 2 years ago

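Training works on whole episodes. To make every episode in a batch the same length, episodes that finish early are padded with zeros. The mask is 0 for these padded steps, which keeps the padded data from producing gradients that would affect training.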

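The division by mask.sum is there because the loss is averaged: mask.sum counts the real (unpadded) steps, so the result is the mean loss over those steps only.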
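The masking and averaging described above can be sketched in a few lines. This is only an illustration in plain Python rather than the repository's PyTorch code. The helpers `pad_episodes` and `masked_mean`, the per-step loss values, and the filler value `99.0` (standing in for whatever a network might output on a padded step) are all hypothetical.

```python
def pad_episodes(episodes, pad_value):
    """Pad per-step values to a common length and build a matching 0/1 mask."""
    max_len = max(len(ep) for ep in episodes)
    padded, mask = [], []
    for ep in episodes:
        n_pad = max_len - len(ep)
        padded.append(list(ep) + [pad_value] * n_pad)
        mask.append([1.0] * len(ep) + [0.0] * n_pad)
    return padded, mask


def masked_mean(values, mask):
    """Mean over real steps only: sum(values * mask) / sum(mask)."""
    total = sum(
        v * m
        for row_v, row_m in zip(values, mask)
        for v, m in zip(row_v, row_m)
    )
    count = sum(m for row in mask for m in row)
    return total / count


# Hypothetical per-step losses for two episodes of length 3 and 1.
step_losses = [[2.0, 4.0, 6.0], [8.0]]

# 99.0 is an arbitrary stand-in for the meaningless loss of a padded step.
padded, mask = pad_episodes(step_losses, pad_value=99.0)

# Multiplying by the mask zeroes the padded steps; dividing by the mask sum
# averages over the 4 real steps: (2 + 4 + 6 + 8) / 4.
masked = masked_mean(padded, mask)

# Without the mask, the two padded entries leak into the average.
n_slots = len(padded) * len(padded[0])
naive = sum(sum(row) for row in padded) / n_slots

print(masked, naive)
```

In an autograd framework the same multiplication also zeroes the gradient contribution of the padded steps, since their loss terms are scaled by 0 before the sum.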