PaddlePaddle / PARL

A high-performance distributed training framework for Reinforcement Learning
https://parl.readthedocs.io/
Apache License 2.0
3.25k stars, 820 forks

parl examples/MADDPG: question about network updates #988

Closed yufei2933 closed 1 year ago

yufei2933 commented 1 year ago

While reproducing the example I noticed something about the network update in parl.MADDPG:

    def learn(self, obs_n, act_n, target_q):
        """ update actor and critic model with MADDPG algorithm """
        actor_cost = self._actor_learn(obs_n, act_n)
        critic_cost = self._critic_learn(obs_n, act_n, target_q)
        self.sync_target()
        return critic_cost

In example/maddpg, only the critic network appears to be updated; the policy network does not seem to be updated. The update code is as follows:

learn

    critic_cost = self.alg.learn(batch_obs_n, batch_act_n, target_q)
    critic_cost = float(critic_cost.cpu().detach())

    return critic_cost

Is the policy network updated somewhere else?

yufei2933 commented 1 year ago

No problem...
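
For readers who hit the same confusion: the `learn` method quoted in the question calls `self._actor_learn(...)` before `self._critic_learn(...)`, so the policy network is updated inside `learn` even though only `critic_cost` is returned. Below is a minimal pure-Python sketch of that pattern. It is not PARL's implementation: `ToyMADDPG`, its scalar "weights", and the hand-written gradient steps are hypothetical stand-ins, and the target-network sync is omitted.

```python
class ToyMADDPG:
    """Hypothetical stand-in for an actor-critic algorithm (not PARL's class)."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.actor_w = 1.0   # stand-in for the policy network parameters
        self.critic_w = 0.5  # stand-in for the critic network parameters

    def _q(self, obs, act):
        # Toy critic: Q(obs, act) = critic_w * obs * act
        return self.critic_w * obs * act

    def _actor_learn(self, obs):
        # Toy policy: act = actor_w * obs; actor loss is -Q(obs, act).
        act = self.actor_w * obs
        cost = -self._q(obs, act)
        grad = -self.critic_w * obs * obs  # d(cost) / d(actor_w)
        self.actor_w -= self.lr * grad     # the actor is updated here
        return cost

    def _critic_learn(self, obs, act, target_q):
        # Critic loss is the squared error against the target Q value.
        err = self._q(obs, act) - target_q
        grad = 2 * err * obs * act         # d(err ** 2) / d(critic_w)
        self.critic_w -= self.lr * grad
        return err ** 2

    def learn(self, obs, act, target_q):
        actor_cost = self._actor_learn(obs)  # side effect: actor_w changes
        critic_cost = self._critic_learn(obs, act, target_q)
        return critic_cost                   # only the critic cost is reported


alg = ToyMADDPG()
before = alg.actor_w
critic_cost = alg.learn(obs=1.0, act=1.0, target_q=1.0)
print(alg.actor_w != before)  # True: the actor weight moved during learn()
```

The return value only decides what gets logged. Both updates happen as side effects of the single `learn` call, which is why the example code never updates the policy network separately.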

TomorrowIsAnOtherDay commented 1 year ago

...

TomorrowIsAnOtherDay commented 1 year ago

Feel free to star PARL to follow future updates :)