marlbenchmark / off-policy

PyTorch implementations of popular off-policy multi-agent reinforcement learning algorithms, including QMix, VDN, MADDPG, and MATD3.
MIT License
386 stars 67 forks

Training reward keeps decreasing? #2

Closed: HorizonLiang closed this issue 2 years ago

HorizonLiang commented 2 years ago

When I run train_mpe.py, the log shows the reward getting lower and lower.

ollehhello commented 2 years ago

Hello, I am seeing the same problem. Do you know what is causing it?

ollehhello commented 2 years ago

I am running MADDPG.