philtabor / Deep-Q-Learning-Paper-To-Code

MIT License
351 stars 146 forks

No gradient required for q_target calculation #16

Open Kaustubh-Khedkar opened 2 years ago

Kaustubh-Khedkar commented 2 years ago

Hi @philtabor, there is a possible bug in the `dqn_agent.py` file at line 93:

q_target = rewards + self.gamma*q_next

needs to be replaced with:

with torch.no_grad():
    q_target = rewards + self.gamma*q_next
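For context, here is a minimal sketch of why this matters (the network, shapes, and variable names are hypothetical stand-ins, not the repo's actual code): without `torch.no_grad()`, `q_target` carries a `grad_fn`, so the loss would also backpropagate through the target term instead of treating it as a constant.

```python
import torch
import torch.nn as nn

# Hypothetical stand-ins for the agent's Q-network and batch tensors.
q_net = nn.Linear(4, 2)      # toy Q-network: 4 state dims, 2 actions
states = torch.randn(3, 4)   # batch of 3 next-states
rewards = torch.randn(3)
gamma = 0.99

# Without no_grad(): the target tensor tracks gradients.
q_next = q_net(states).max(dim=1)[0]
q_target_tracked = rewards + gamma * q_next
print(q_target_tracked.requires_grad)  # True

# With no_grad(): the target is a constant with respect to the network.
with torch.no_grad():
    q_next = q_net(states).max(dim=1)[0]
    q_target = rewards + gamma * q_next
print(q_target.requires_grad)  # False
```

An equivalent fix is to call `.detach()` on `q_next` before forming the target; both stop gradients from flowing through the target term.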

This issue is also raised in https://github.com/philtabor/Deep-Q-Learning-Paper-To-Code/issues/9#issue-723686152

Could you please take a look?