distinguish Qnet and set detach() on TD target

datawhalechina / easy-rl

A Chinese-language reinforcement learning tutorial (the "Mushroom Book" 🍄). Read it online at: https://datawhalechina.github.io/easy-rl/


9.04k stars 1.81k forks

distinguish Qnet and set detach() on TD target #117

Closed tlt18 closed 1 year ago

tlt18 commented 1 year ago

I found two problems with the Double DQN code.

1. The TD target is not wrapped in detach(), so gradients also flow through the target; this is equivalent to not using the semi-gradient method.
2. While debugging, I found that policy_net and target_net always produce the same outputs. This is because they refer to the same network, so target_net changes immediately whenever policy_net changes.
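The two points above can be illustrated with a minimal PyTorch sketch. This is not the repository's code: the network architecture, the sizes, the random batch, and the choice of SGD are assumptions made only for illustration. It shows one way to give target_net its own parameters (via copy.deepcopy) and to detach the Double DQN TD target before computing the loss.

```python
import copy

import torch
import torch.nn as nn
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes and network, chosen only for illustration.
state_dim, n_actions, batch_size, gamma = 4, 2, 8, 0.99

policy_net = nn.Sequential(
    nn.Linear(state_dim, 16), nn.ReLU(), nn.Linear(16, n_actions)
)
# deepcopy gives target_net its own parameters. Writing
# `target_net = policy_net` would bind both names to the same module.
target_net = copy.deepcopy(policy_net)
optimizer = torch.optim.SGD(policy_net.parameters(), lr=0.1)

# Random stand-in for a replay-buffer batch (assumption, not real data).
states = torch.randn(batch_size, state_dim)
actions = torch.randint(0, n_actions, (batch_size, 1))
rewards = torch.randn(batch_size, 1)
next_states = torch.randn(batch_size, state_dim)
dones = torch.zeros(batch_size, 1)

# Q(s, a) for the actions that were taken.
q_values = policy_net(states).gather(1, actions)

# Double DQN: policy_net selects the next action, target_net evaluates it.
next_actions = policy_net(next_states).argmax(dim=1, keepdim=True)
next_q = target_net(next_states).gather(1, next_actions)

# detach() turns the TD target into a constant for backpropagation,
# which is the semi-gradient update.
td_target = (rewards + gamma * (1 - dones) * next_q).detach()

loss = F.mse_loss(q_values, td_target)
optimizer.zero_grad()
loss.backward()
optimizer.step()

# After one step, policy_net has moved while target_net has not.
diverged = any(
    not torch.equal(p, t)
    for p, t in zip(policy_net.parameters(), target_net.parameters())
)

# In a training loop this sync would run only every few steps.
target_net.load_state_dict(policy_net.state_dict())
```

With separate parameters, the two networks differ between syncs, so target_net provides a fixed evaluation of the next state until load_state_dict is called again.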

johnjim0816 commented 1 year ago

Thanks for your PR; I will consider your suggestions. However, we are currently moving all algorithms to a new template, so I cannot merge your PR right now. I will add an acknowledgment to you when I update Double DQN.