PacktPublishing / Deep-Reinforcement-Learning-Hands-On

Hands-on Deep Reinforcement Learning, published by Packt
MIT License

Advantage computation fix - Chapter 8 models.py #9

Closed: Sycor4x closed 5 years ago

Sycor4x commented 5 years ago

Fixes a bug where advantage values were computed as an average across the whole minibatch instead of one advantage value per sample. See: https://github.com/PacktPublishing/Deep-Reinforcement-Learning-Hands-On/issues/6
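
To illustrate the difference, here is a minimal pure-Python sketch. It assumes the affected code is a dueling-style head that combines a per-sample state value with per-action advantages by subtracting the advantage mean; the function names and the toy numbers are hypothetical and are not taken from `models.py`. In PyTorch terms this would likely correspond to calling `adv.mean()` over the whole tensor versus `adv.mean(dim=1, keepdim=True)` per row, but that mapping is an assumption about the patch, not a quote from it.

```python
# Hypothetical helpers for illustration only; not the repository's code.

def combine_batch_mean(values, advantages):
    """Buggy variant: subtracts ONE mean taken over the whole minibatch."""
    flat = [a for row in advantages for a in row]
    batch_mean = sum(flat) / len(flat)
    return [[v + a - batch_mean for a in row]
            for v, row in zip(values, advantages)]


def combine_per_sample_mean(values, advantages):
    """Fixed variant: subtracts each sample's own mean over its actions."""
    out = []
    for v, row in zip(values, advantages):
        row_mean = sum(row) / len(row)
        out.append([v + a - row_mean for a in row])
    return out


# Toy minibatch: 2 samples, 2 actions each (made-up numbers).
values = [1.0, 2.0]
advantages = [[1.0, 3.0], [10.0, 30.0]]

buggy = combine_batch_mean(values, advantages)
fixed = combine_per_sample_mean(values, advantages)

print(buggy)  # the second sample's large advantages shift the first sample's outputs
print(fixed)  # each row is centred on its own state value
```

With the per-sample mean, each row of outputs averages to that sample's own state value, so one sample's advantages no longer leak into another sample's outputs.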

Shmuma commented 5 years ago

Thanks!