Kaixhin / Rainbow

Rainbow: Combining Improvements in Deep Reinforcement Learning
MIT License

Memory capacity for example data-efficient Rainbow? #55

guydav closed this issue 4 years ago

guydav commented 4 years ago

Hi folks,

I'm running the data-efficient Rainbow as a baseline for a project I'm starting, and one thing isn't making sense in my head. The original Rainbow paper uses a 1M-transition buffer, whereas the data-efficient paper (Appendix E) claims to use an unbounded memory.

Do you have any sense of what an unbounded memory even means in practice? Is there any particular reason you chose to make it smaller than the default Rainbow's memory buffer, rather than larger?

Thank you!

Kaixhin commented 4 years ago

I understood it as meaning a capacity at least as large as the number of steps they train for, hence I set both T-max and memory-capacity to 100000 (this avoids allocating a 1M-transition buffer that would never be filled).
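To illustrate the reasoning above, here is a minimal sketch (not the repository's replay memory, which is more involved) that models the buffer as a fixed-size FIFO using `collections.deque`. The helper `fill_buffer` is hypothetical and assumes one transition is stored per training step. It shows that a capacity equal to the training length never evicts anything, so it behaves like an unbounded memory.

```python
from collections import deque


def fill_buffer(capacity, total_steps):
    """Append one dummy transition per step; return (final size, number evicted)."""
    buffer = deque(maxlen=capacity)  # a full deque drops its oldest item on append
    evicted = 0
    for step in range(total_steps):
        if len(buffer) == capacity:
            evicted += 1  # the next append overwrites the oldest transition
        buffer.append(step)
    return len(buffer), evicted


# Capacity equal to the training length: nothing is ever evicted.
print(fill_buffer(capacity=100_000, total_steps=100_000))

# A 1M buffer with the same training length: 90% of it is never used.
print(fill_buffer(capacity=1_000_000, total_steps=100_000))

# Training for longer than the capacity: the oldest transitions are lost.
print(fill_buffer(capacity=100_000, total_steps=150_000))
```

Under these assumptions, only the third case discards data, which is the situation to watch for when T-max is raised without also raising memory-capacity.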

guydav commented 4 years ago

Makes sense. It might be more sensible to add an argument that does this explicitly, rather than implicitly? Because if you increase the number of training iterations, you need to know you should also separately increase the memory size.
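As a rough sketch of what such an explicit argument could look like, the snippet below uses `argparse` with a hypothetical `--unbounded` flag that sizes the buffer to the training length. The flag, the `parse_args` helper, and the default values are assumptions for illustration only, not the repository's actual command-line interface.

```python
import argparse


def parse_args(argv=None):
    """Parse options; with --unbounded, size the replay memory to hold every step."""
    parser = argparse.ArgumentParser(description="Sketch: tie memory capacity to T-max")
    parser.add_argument("--T-max", type=int, default=100_000,
                        help="Number of training steps")
    parser.add_argument("--memory-capacity", type=int, default=100_000,
                        help="Replay memory capacity in transitions")
    # Hypothetical flag, assumed for this sketch only.
    parser.add_argument("--unbounded", action="store_true",
                        help="Set the memory capacity equal to T-max")
    args = parser.parse_args(argv)
    if args.unbounded:
        # argparse stores --T-max as T_max and --memory-capacity as memory_capacity.
        args.memory_capacity = args.T_max
    return args


args = parse_args(["--T-max", "400000", "--unbounded"])
print(args.T_max, args.memory_capacity)
```

One trade-off of this design is that a large T-max silently allocates an equally large buffer, so memory use grows with the training length.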

Kaixhin commented 4 years ago

Added a note in the readme to explain my reasoning for this. I think knowing about this is better than naively setting --unbounded for a large T-max.

guydav commented 4 years ago

Makes sense -- that's probably an even better solution.