I have a question regarding the OnPolicyBaseRunner. In the init function, if the actors share parameters, self.actor.append(self.actor[0]) is called, so self.actor ends up as a list of N references to the same actor. Then why does the run function apply lr_decay only to the first actor? When self.share_param is set, it calls self.actor[0].lr_decay(episode, episodes).
Hello, thank you for acknowledging our work. If the actors share parameters, every entry in self.actor refers to the same object as self.actor[0]. Decaying the learning rate of actor[0] therefore decays it for all of them, so one call should suffice.
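To illustrate the point about shared references, here is a minimal sketch in plain Python. ToyActor and build_actors are hypothetical stand-ins written for this reply, not the repository's classes, and the linear decay schedule is an assumption made only for the example. It shows that when the list holds the same object N times, a single lr_decay call on index 0 is visible through every entry.

```python
class ToyActor:
    """Hypothetical stand-in for an actor; it only tracks a learning rate."""

    def __init__(self, lr):
        self.initial_lr = lr
        self.lr = lr

    def lr_decay(self, episode, episodes):
        # Assumed linear schedule, used here purely for illustration.
        self.lr = self.initial_lr * (1 - episode / episodes)


def build_actors(num_agents, share_param, lr=0.5):
    """Build the actor list; with sharing, later slots alias the first actor."""
    actors = []
    for agent_id in range(num_agents):
        if share_param and agent_id > 0:
            # Appends a reference to the same object, not a copy.
            actors.append(actors[0])
        else:
            actors.append(ToyActor(lr))
    return actors


# Parameter sharing: one decay call on actors[0] affects every entry.
shared = build_actors(3, share_param=True)
shared[0].lr_decay(episode=5, episodes=10)
shared_lrs = [actor.lr for actor in shared]

# No sharing: each entry is a distinct object, so each needs its own call.
separate = build_actors(3, share_param=False)
separate[0].lr_decay(episode=5, episodes=10)
separate_lrs = [actor.lr for actor in separate]

print(shared_lrs)
print(separate_lrs)
```

In the shared case, calling lr_decay on every list entry would just repeat the same update on the same object, which is why one call is enough there.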
Hi, thank you for your amazing work.
Thank you and look forward to your reply.