Chapter07/04_dqn_noisy_net.py stability

Hi Max,

I've found that training with the NoisyNets implementation is unstable. In my experiments, only 1-2 out of 5 runs converge with the shortened Pong hyperparams (with both independent Gaussian and factorized Gaussian noise). Reducing the learning rate from 1e-4 to 5e-5 seems to improve stability to 4-5 converged runs out of 5, with minimal effect on convergence speed. I hope this helps anyone else who is having trouble with it.
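For anyone who wants to try the same change, here is a minimal sketch of the override. The dict layout, the `env_name` entry, and the helper names `with_learning_rate` and `convergence_rate` are assumptions for illustration and are not taken from the repo; only the two learning rate values come from the runs described above.

```python
import copy

# Hypothetical stand-in for the Pong hyperparameter dict. Only the learning
# rate values (1e-4 and 5e-5) come from this report; the layout is assumed.
PONG_PARAMS = {
    "env_name": "PongNoFrameskip-v4",  # assumed entry, for illustration only
    "learning_rate": 1e-4,
}


def with_learning_rate(params, lr):
    """Return a copy of params with the learning rate replaced."""
    if lr <= 0:
        raise ValueError("learning rate must be positive")
    new_params = copy.deepcopy(params)  # leave the original dict untouched
    new_params["learning_rate"] = lr
    return new_params


def convergence_rate(run_results):
    """Fraction of runs that converged, given one boolean per run."""
    if not run_results:
        raise ValueError("need at least one run")
    return sum(run_results) / len(run_results)


# Halve the learning rate, as suggested above.
stable_params = with_learning_rate(PONG_PARAMS, 5e-5)
```

The value in `stable_params["learning_rate"]` would then go wherever the script builds its optimizer, for example as the `lr` argument of `torch.optim.Adam`. Running several seeds and passing one boolean per run to `convergence_rate` gives a quick way to compare the two settings.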

Cheers, Dave

PacktPublishing/Deep-Reinforcement-Learning-Hands-On, issue #31