alessiodm / drl-zh

Deep Reinforcement Learning: Zero to Hero!
MIT License

03_DQN.ipynb - ease-of-use improvements #4

Closed: fancyfredbot closed this issue 2 months ago

fancyfredbot commented 2 months ago

I wanted to make two more suggestions about the 03_DQN.ipynb notebook:

1) The default replay buffer size uses a lot of memory, which caused me a few out-of-memory problems. I have found that a 10k rather than 100k replay buffer is fine and still converges nicely. It also lets me keep all the states on my 4 GB RTX 3050 laptop GPU, which is nice; the sketch below these suggestions includes the back-of-the-envelope memory math.

2) To get the test for the QNetwork to pass, you have to use no bias in the first Conv2d layer. I had to cheat and look at the solution to figure that out, and I am still not really sure why we skip the bias in that layer!
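
For reference, here is a rough sketch of the kind of ring-buffer replay memory I mean (the names and shapes are illustrative, not the notebook's actual code), along with the memory arithmetic that motivated suggestion 1:

```python
import numpy as np

class ReplayBuffer:
    """Fixed-capacity ring buffer for (s, a, r, s', done) transitions."""

    def __init__(self, capacity: int, state_shape: tuple, state_dtype=np.uint8):
        self.capacity = capacity
        self.states = np.zeros((capacity, *state_shape), dtype=state_dtype)
        self.next_states = np.zeros((capacity, *state_shape), dtype=state_dtype)
        self.actions = np.zeros(capacity, dtype=np.int64)
        self.rewards = np.zeros(capacity, dtype=np.float32)
        self.dones = np.zeros(capacity, dtype=np.bool_)
        self.pos = 0
        self.full = False

    def add(self, s, a, r, s_next, done):
        # Overwrite the oldest transition once the buffer wraps around.
        i = self.pos
        self.states[i], self.actions[i] = s, a
        self.rewards[i], self.next_states[i], self.dones[i] = r, s_next, done
        self.pos = (self.pos + 1) % self.capacity
        self.full = self.full or self.pos == 0

    def sample(self, batch_size: int):
        # Sample uniformly from the filled portion of the buffer.
        high = self.capacity if self.full else self.pos
        idx = np.random.randint(0, high, size=batch_size)
        return (self.states[idx], self.actions[idx], self.rewards[idx],
                self.next_states[idx], self.dones[idx])

# Back-of-the-envelope memory cost, assuming the usual 4x84x84 uint8
# frame stacks (an assumption about the notebook's preprocessing):
#   100_000 * 4 * 84 * 84 bytes ~= 2.8 GB per state array (x2 with next_states),
#   while 10_000 transitions need only ~0.28 GB each, fitting a 4 GB GPU.
```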

I hope these comments are helpful - I am really happy with these notebooks and found them very useful even without the YouTube videos!

alessiodm commented 2 months ago

Your comments are incredibly useful - thank you very much for all your feedback, both in this issue and in the one about PG (which I am addressing soon)!

I integrated both 1 and 2: I changed the replay buffer size to 10k, and removed the spurious bias=False from the first layer of the convnet.
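
For anyone who hit the same test failure: PyTorch's nn.Conv2d has bias=True by default, so "removing the no-bias" just means not passing bias=False in the first layer. Here is a sketch of a typical Atari-style Q-network for illustration; the layer sizes are the classic Mnih et al. ones, assumed here rather than copied from the notebook:

```python
import torch.nn as nn

class QNetwork(nn.Module):
    """Illustrative DQN convnet; architecture is an assumption, not
    necessarily the notebook's exact solution."""

    def __init__(self, n_actions: int):
        super().__init__()
        self.net = nn.Sequential(
            # bias defaults to True; the earlier solution passed bias=False
            # here, which is what the parameter-count test tripped on.
            nn.Conv2d(4, 32, kernel_size=8, stride=4),
            nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2),
            nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1),
            nn.ReLU(),
            nn.Flatten(),
            nn.Linear(64 * 7 * 7, 512),  # 7x7 spatial size from an 84x84 input
            nn.ReLU(),
            nn.Linear(512, n_actions),
        )

    def forward(self, x):
        # Scale uint8 pixel values into [0, 1] before the conv stack.
        return self.net(x / 255.0)
```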