PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Hands-on Deep Reinforcement Learning, published by Packt
MIT License
2.83k stars, 1.28k forks

Chapter13: wob_click_train.py not getting 0.8 mean reward even after 200k #27

Open hemanthsavasere opened 5 years ago

hemanthsavasere commented 5 years ago

Hi @Shmuma

wob_click_train.py does not reach a mean reward of 0.8 even after running for 250k steps. According to your book, it should have reached this level by 200k steps. I am training on GCP with an Nvidia Tesla K80 GPU.

Thanks