starry-sky6688 / MADDPG

PyTorch implementation of the MARL algorithm MADDPG, corresponding to the paper "Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments".
516 stars, 80 forks

Handling of done #37

Closed: lgzid closed this issue 10 months ago

lgzid commented 10 months ago

In the part of MADDPG.train that computes q_target, I don't see any handling of whether done is True or False, and replay_buffer doesn't store the done value either. Isn't that a problem?

starry-sky6688 commented 10 months ago

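No, it isn't. This task has no termination condition, so it is a continuing (non-episodic) case and every transition can bootstrap from the next state's Q-value.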
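To make the difference concrete, here is a minimal sketch of the one-step TD target with and without a done mask. It is not the repository's code: the function name `td_target`, its parameters, and the sample numbers are all hypothetical, and plain Python lists stand in for tensors. When `dones` is omitted, every transition is treated as non-terminal, which is the continuing-task assumption described above.

```python
def td_target(rewards, next_q, gamma=0.95, dones=None):
    """Compute r + gamma * (1 - done) * Q'(s', a') for each transition.

    Hypothetical helper for illustration only. If `dones` is None, all
    transitions are treated as non-terminal (continuing task).
    """
    if dones is None:
        dones = [False] * len(rewards)
    return [
        # A terminal transition drops the bootstrap term entirely.
        r + gamma * (0.0 if d else 1.0) * q
        for r, q, d in zip(rewards, next_q, dones)
    ]


# Made-up sample batch of two transitions.
rewards = [1.0, 0.5]
next_q = [2.0, 4.0]

# Continuing task: no mask, both transitions bootstrap.
no_mask = td_target(rewards, next_q, gamma=0.5)

# Episodic task: the second transition is terminal, so its target is just r.
masked = td_target(rewards, next_q, gamma=0.5, dones=[False, True])
```

The two targets differ only on transitions where done is True, so in a task that never terminates the mask would always be 1 and storing done would change nothing. If the environment did have true terminal states, the mask (and a done field in the buffer) would be needed. Episodes that are merely cut off by a step limit are usually still treated as non-terminal for bootstrapping.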