Explanation of Q statistics in plots for "val_mem"?

Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning

MIT License


Explanation of Q statistics in plots for "val_mem"? #70

Closed: andrewjmcgehee closed this issue 4 years ago

andrewjmcgehee commented 4 years ago

I'm not sure I understand the plots for Q. In main.py, val_mem is only ever filled with states once, but in test.py Q is evaluated at every evaluation interval. So is this plot effectively graphing the Q-values assigned to the random states that are initially put into val_mem? Sorry if this is an obvious question; I'm brand new to deep RL research, and this re-implementation of Rainbow is really fascinating.

Kaixhin commented 4 years ago

Yes, you understood correctly. The upside is that the Q-values are always evaluated against a fixed dataset, so the curve is comparable across evaluations. The downside is that this dataset doesn't contain a representative set of states from the environment.
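
To make the idea concrete, here is a minimal sketch of evaluating Q on a fixed validation set. It is not the repository's code: the toy linear Q-function, the helper names (collect_validation_states, q_values, average_max_q), and the choice of "mean of max-over-actions Q" as the plotted statistic are all assumptions made for illustration. The point is only that the states are collected once and reused at every evaluation, while the Q-function keeps changing.

```python
import random

def collect_validation_states(num_states, state_dim, seed=0):
    """Hypothetical stand-in for filling val_mem once, before training starts."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(state_dim)]
            for _ in range(num_states)]

def q_values(weights, state):
    """Toy linear Q-function (an assumption): one weight vector per action."""
    return [sum(w * s for w, s in zip(action_w, state)) for action_w in weights]

def average_max_q(weights, val_states):
    """Mean over the fixed states of max_a Q(s, a)."""
    return sum(max(q_values(weights, s)) for s in val_states) / len(val_states)

# Collected once; never updated afterwards.
val_states = collect_validation_states(num_states=5, state_dim=3)
snapshot = [list(s) for s in val_states]

rng = random.Random(1)
weights = [[0.0] * 3 for _ in range(2)]  # 2 actions, 3 state features
history = []
for evaluation in range(3):
    # Placeholder for a training phase: the Q-function changes...
    weights = [[w + rng.uniform(-0.1, 0.1) for w in action_w]
               for action_w in weights]
    # ...but it is always scored on the same fixed states.
    history.append(average_max_q(weights, val_states))
```

Each entry in history corresponds to one point on the Q plot. Because val_states never changes, differences between points reflect changes in the Q-function only, which is the upside mentioned above; the states themselves still come from the initial random collection, which is the downside.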