Open Ashutosh-Adhikari opened 6 years ago
Quick questions. Prioritized experience replay is just a sampling method. It should only affect the way we sample from the replay buffer. Why change the loss function to a weighted one? Have you tested the performance?
Please have a look at line 13 of the algorithm described in the PER paper. I have only tested on Breakout, where it gave a slight performance improvement over DQN. Please also have a look at Kaixhin/Rainbow#15 and let me know.
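For context on why the loss is weighted: sampling transitions non-uniformly biases the update, and the PER paper corrects this by scaling each TD error with an importance-sampling weight. Below is a minimal sketch of that idea in plain Python, not the code from this repository. The function names `per_sample` and `weighted_mse`, the default `alpha` and `beta` values, and the choice to normalise by the largest possible weight over the whole buffer (rather than over the batch) are all assumptions made for illustration.

```python
import random


def per_sample(priorities, batch_size, alpha=0.6, beta=0.4, rng=random):
    """Sample indices with probability proportional to priority**alpha.

    Returns the sampled indices and their importance-sampling weights.
    Hypothetical helper; assumes every priority is strictly positive.
    """
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]
    n = len(priorities)
    indices = rng.choices(range(n), weights=probs, k=batch_size)
    # w_i = (N * P(i)) ** -beta, divided by the largest weight so that w_i <= 1.
    # Assumption: the maximum is taken over the whole buffer, not just the batch.
    max_w = (n * min(probs)) ** -beta
    weights = [((n * probs[i]) ** -beta) / max_w for i in indices]
    return indices, weights


def weighted_mse(td_errors, weights):
    """Mean of w_i * delta_i**2: the per-sample loss scaled by its IS weight."""
    return sum(w * d * d for w, d in zip(weights, td_errors)) / len(td_errors)
```

With uniform priorities every weight equals 1 and the loss reduces to the ordinary mean squared TD error, so the weighting only matters once the sampling distribution departs from uniform.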
Hi, did you get time to check the code through tests? :)
Hi,
The reference code for the PER additions: https://github.com/Kaixhin/Rainbow.git.
It is a segment-tree-based implementation of PER.
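To illustrate what a segment-tree-based implementation buys you, here is a minimal sum-tree sketch: priorities sit in the leaves, each internal node stores the sum of its children, and both updating a priority and drawing a proportional sample take O(log N). This is a generic illustration and not the code from the linked repository; the class name `SumTree` and its method names are hypothetical.

```python
import random


class SumTree:
    """Array-backed binary tree whose internal nodes hold sums of priorities."""

    def __init__(self, capacity):
        self.capacity = capacity
        # Root is tree[1]; leaves occupy tree[capacity : 2 * capacity].
        self.tree = [0.0] * (2 * capacity)

    def update(self, index, priority):
        """Set the priority of one leaf and refresh the sums above it."""
        pos = index + self.capacity
        self.tree[pos] = priority
        pos //= 2
        while pos >= 1:
            self.tree[pos] = self.tree[2 * pos] + self.tree[2 * pos + 1]
            pos //= 2

    def total(self):
        """Sum of all priorities (stored at the root)."""
        return self.tree[1]

    def find(self, value):
        """Return the leaf index whose cumulative-priority interval contains value."""
        pos = 1
        while pos < self.capacity:
            left = 2 * pos
            if value < self.tree[left]:
                pos = left
            else:
                value -= self.tree[left]
                pos = left + 1
        return pos - self.capacity

    def sample(self, rng=random):
        """Draw one index with probability proportional to its priority."""
        return self.find(rng.random() * self.total())
```

A typical use would be to call `update(i, priority)` whenever a transition's TD error changes and `sample()` once per batch element; the sampled index's probability is its priority divided by `total()`, which is what the importance-sampling weight needs.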