Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Updating Priorities with Importance Weighted Loss instead of TD-Error #24

Closed: nasimrahaman closed this issue 6 years ago

nasimrahaman commented 6 years ago

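In this line, the priorities are updated with the importance-sampling-weighted loss rather than the TD error (see this line). This does not appear to be consistent with Algorithm 1 of Schaul et al. (2016), where the priority is set to the absolute TD error and the importance-sampling weights only scale the gradient update. Is this intentional?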

P.S. great work!

Kaixhin commented 6 years ago

This is indeed intentional and based on personal correspondence. I've added a comment in the code and put a note in the repo wiki to reflect this. I don't actually know how much impact this has, though, and I don't have the resources to test the difference.
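
For readers following the thread, the difference between the two priority updates can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions, not the code in this repo: the function names are hypothetical, and the absolute TD error stands in for the per-sample loss (Rainbow itself uses a distributional loss). The importance-sampling weight follows the form in Schaul et al., w_i = (N * P(i))^(-beta), normalised by the largest weight in the batch.

```python
def importance_weights(probs, buffer_size, beta):
    """IS weights w_i = (N * P(i))^(-beta), divided by the max weight in the batch."""
    raw = [(buffer_size * p) ** (-beta) for p in probs]
    max_w = max(raw)
    return [w / max_w for w in raw]


def priorities_from_td_error(td_errors):
    """Algorithm 1 of Schaul et al.: new priority is the absolute TD error."""
    return [abs(d) for d in td_errors]


def priorities_from_weighted_loss(td_errors, weights):
    """Variant raised in this issue: new priority is the IS-weighted per-sample loss.

    Assumption: |TD error| is used as a stand-in for the per-sample loss.
    """
    return [w * abs(d) for w, d in zip(weights, td_errors)]


# Hypothetical batch of three transitions drawn from a buffer of size 4.
probs = [0.5, 0.25, 0.25]      # sampling probabilities P(i)
td_errors = [2.0, -1.0, 0.5]   # TD errors for the sampled transitions

weights = importance_weights(probs, buffer_size=4, beta=1.0)
plain = priorities_from_td_error(td_errors)
weighted = priorities_from_weighted_loss(td_errors, weights)
```

In this toy batch the first transition is sampled most often, so it receives the smallest weight, and the weighted variant assigns it a lower new priority than the plain TD-error update would. Whether that damping helps or hurts in practice is exactly the open question in this thread.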