Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Fix max in prioritised experience replay #13

Closed: Kaixhin closed this 6 years ago

Kaixhin commented 6 years ago

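The max priority used to initialise the priorities of new transitions should be the "current" max, i.e. the max over the transitions currently stored in the replay memory. At the moment it is set to the all-time max instead, so it never decreases, even after the transition that produced it has been overwritten or had its priority lowered. Although this seems a small bug, it is still a bug. The best solution would be to combine the existing sum tree with a max tree, so that the current max can be read from the root and stays correct after every priority update.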
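
A minimal sketch of what combining the two trees could look like, not the repository's actual implementation. The class name `SumMaxTree`, its methods, and the `default` priority of 1.0 for an empty memory are all assumptions made for illustration. Each node stores both the sum and the max of its subtree, so one leaf-to-root pass updates both, and the root always holds the max over priorities currently stored.

```python
class SumMaxTree:
    """Hypothetical array-backed binary tree tracking the sum and max of leaf priorities."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Node 1 is the root; leaves live at indices capacity .. 2 * capacity - 1.
        self.sums = [0.0] * (2 * capacity)
        self.maxes = [0.0] * (2 * capacity)
        self.next_index = 0  # Circular write position for new transitions

    def update(self, index, priority):
        """Set the priority of one leaf and recompute its ancestors."""
        node = index + self.capacity
        self.sums[node] = priority
        self.maxes[node] = priority
        node //= 2
        while node >= 1:
            left, right = 2 * node, 2 * node + 1
            self.sums[node] = self.sums[left] + self.sums[right]
            self.maxes[node] = max(self.maxes[left], self.maxes[right])
            node //= 2

    def total(self):
        return self.sums[1]

    def max_priority(self):
        """Max over priorities currently stored, not the all-time max."""
        return self.maxes[1]

    def append(self, default=1.0):
        """Add a new transition with the current max priority (assumed default if empty)."""
        current_max = self.max_priority()
        priority = current_max if current_max > 0 else default
        index = self.next_index
        self.update(index, priority)
        self.next_index = (index + 1) % self.capacity
        return index

    def find(self, value):
        """Return the leaf index reached by descending with a value in (0, total]."""
        node = 1
        while node < self.capacity:
            left = 2 * node
            if value <= self.sums[left]:
                node = left
            else:
                value -= self.sums[left]
                node = left + 1
        return node - self.capacity
```

With this layout, lowering the priority of the transition that held the max also lowers the value returned by `max_priority()`, which is the behaviour the all-time max lacks. The cost is one extra array of the same size as the sum tree and one extra comparison per node on each update.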