udacity / deep-reinforcement-learning

Repo for the Deep Reinforcement Learning Nanodegree program
https://www.udacity.com/course/deep-reinforcement-learning-nanodegree--nd893
MIT License
4.85k stars, 2.34k forks

OUNoise should use normal distribution #19

Closed (ghost closed 2 years ago)

ghost commented 5 years ago

OUNoise should use normal distribution.

The current implementation uses random.random(), which I believe samples from a uniform distribution on [0, 1). This can hurt the exploration ability of the DDPG agent, since every noise increment is non-negative and the resulting noise has a positive bias.
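To illustrate the point, here is a minimal sketch of an Ornstein-Uhlenbeck noise process whose random term can be switched between uniform and normal draws. It is not the repo's actual class: the name `OUNoiseSketch`, the `gaussian` flag, the helper `long_run_mean`, and the values `theta=0.15` and `sigma=0.2` are assumptions made for this example.

```python
import numpy as np


class OUNoiseSketch:
    """Illustrative Ornstein-Uhlenbeck process (hypothetical, not the repo's class)."""

    def __init__(self, size, seed=0, mu=0.0, theta=0.15, sigma=0.2, gaussian=True):
        self.mu = mu * np.ones(size)   # long-run target of the process
        self.theta = theta             # assumed mean-reversion rate
        self.sigma = sigma             # assumed noise scale
        self.gaussian = gaussian       # True: N(0, 1) draws, False: U[0, 1) draws
        self.rng = np.random.default_rng(seed)
        self.reset()

    def reset(self):
        """Reset the internal state to the mean."""
        self.state = self.mu.copy()

    def sample(self):
        """Advance the process by one step and return the new state."""
        x = self.state
        if self.gaussian:
            eps = self.rng.standard_normal(x.shape)  # zero-mean increments
        else:
            eps = self.rng.random(x.shape)           # increments in [0, 1), mean 0.5
        self.state = x + self.theta * (self.mu - x) + self.sigma * eps
        return self.state


def long_run_mean(gaussian, steps=20000):
    """Average of a single noise dimension over many steps."""
    noise = OUNoiseSketch(size=1, seed=0, gaussian=gaussian)
    samples = np.array([noise.sample()[0] for _ in range(steps)])
    return samples.mean()


if __name__ == "__main__":
    print("uniform draws, long-run mean:", long_run_mean(gaussian=False))
    print("normal draws,  long-run mean:", long_run_mean(gaussian=True))
```

With uniform draws the process settles well above zero, so the agent's actions are pushed in one direction. With normal draws it hovers around `mu`.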

mean, std, min, max of OUNoise before fix: 0.6662002074296958, 0.10970679264023238, 0.05178335005859267, 1.111826336326043

mean, std, min, max of OUNoise after fix: 0.002004800976725908, 0.3797033350628932, -1.758558674034922, 1.758029080992971
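As a sanity check on the numbers above, the stationary mean and standard deviation of the update `x' = (1 - theta) * x + sigma * eps` (with `mu = 0`) can be computed in closed form. The sketch below assumes `theta = 0.15` and `sigma = 0.2`; those values and the helper name `stationary_stats` are assumptions for illustration, not something stated in this issue.

```python
import math


def stationary_stats(theta, sigma, eps_mean, eps_var):
    """Stationary mean and std of x' = (1 - theta) * x + sigma * eps, with mu = 0."""
    # Mean: m = (1 - theta) * m + sigma * E[eps]  =>  m = sigma * E[eps] / theta
    mean = sigma * eps_mean / theta
    # Variance: v = (1 - theta)^2 * v + sigma^2 * Var[eps]
    std = sigma * math.sqrt(eps_var / (1.0 - (1.0 - theta) ** 2))
    return mean, std


THETA, SIGMA = 0.15, 0.2  # assumed parameters, not taken from the issue text

# Uniform on [0, 1): mean 1/2, variance 1/12
uniform_mean, uniform_std = stationary_stats(THETA, SIGMA, 0.5, 1.0 / 12.0)
# Standard normal: mean 0, variance 1
normal_mean, normal_std = stationary_stats(THETA, SIGMA, 0.0, 1.0)

if __name__ == "__main__":
    print("uniform:", uniform_mean, uniform_std)
    print("normal: ", normal_mean, normal_std)
```

Under those assumed parameters, the uniform case gives a mean of about 0.667 and a std of about 0.110, and the normal case gives a mean of 0 and a std of about 0.380. Those values are consistent with the measured statistics reported here.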

[issue: #20]

shpigi commented 2 years ago

Hey course staff, why hasn't this been merged yet? It's 2022 :)

shpigi commented 2 years ago

:(