Closed yufei2933 closed 1 year ago
While reproducing the example I ran into a question about the network update in parl.MADDPG:

    def learn(self, obs_n, act_n, target_q):
        """ update actor and critic model with MADDPG algorithm
        """
        actor_cost = self._actor_learn(obs_n, act_n)
        critic_cost = self._critic_learn(obs_n, act_n, target_q)
        self.sync_target()
        return critic_cost

In example/maddpg, only the critic network appears to be updated; the policy network does not seem to be updated anywhere. The update code there is:

    critic_cost = self.alg.learn(batch_obs_n, batch_act_n, target_q)
    critic_cost = float(critic_cost.cpu().detach())
    return critic_cost

Is the policy network updated somewhere else?
No problem...
...
Feel free to star PARL to follow future updates :)
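For readers who land here with the same question: judging only by the learn method quoted above, the policy update presumably happens inside _actor_learn, and learn simply does not return actor_cost. The sketch below is a toy illustration of that pattern, not PARL's implementation. ToyMADDPG, its scalar "weights", and the squared-error losses are all hypothetical stand-ins chosen so the example runs without PARL or torch.

```python
class ToyMADDPG:
    """Hypothetical stand-in for parl.MADDPG: scalar weights, no PARL or torch."""

    def __init__(self, lr=0.1):
        self.lr = lr
        self.actor_w = 0.0   # stand-in for the policy network parameters
        self.critic_w = 0.0  # stand-in for the critic network parameters

    def _actor_learn(self, obs, act):
        # Toy loss (actor_w * obs - act)^2. The gradient step is applied
        # here, as a side effect, before the cost is returned.
        err = self.actor_w * obs - act
        self.actor_w -= self.lr * 2 * err * obs
        return err ** 2

    def _critic_learn(self, obs, act, target_q):
        # Toy loss (critic_w * (obs + act) - target_q)^2, same pattern.
        err = self.critic_w * (obs + act) - target_q
        self.critic_w -= self.lr * 2 * err * (obs + act)
        return err ** 2

    def learn(self, obs, act, target_q):
        # Mirrors the shape of the quoted learn(): both helpers run,
        # but only the critic cost is handed back to the caller.
        actor_cost = self._actor_learn(obs, act)
        critic_cost = self._critic_learn(obs, act, target_q)
        return critic_cost


alg = ToyMADDPG()
actor_before = alg.actor_w
critic_cost = alg.learn(obs=1.0, act=0.5, target_q=2.0)
# The actor weight moved even though learn() returned only critic_cost.
print(alg.actor_w != actor_before)
```

The return value only decides what gets logged by the caller; whether a network is trained depends on where the optimizer step is executed.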