Hi,
Thank you so much for the tutorial and implementation of Prioritized Experience Replay. I am currently trying to use this code with an OpenAI Gym environment for Atari games, using Double Deep Q-Networks. I used the same code as a starting point for sampling prioritised experiences. The code runs fine until 10,000 steps (the memory buffer size I have set), but after that point sampling_batches comes back as an empty list. I am not sure why this happens. Have you encountered this problem before?
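For reference, here is a minimal sketch of the fixed-capacity sampling behaviour I expected: once the buffer is full, new transitions should overwrite the oldest ones via a circular write index, so sampling keeps returning batches past the capacity. (The class and method names below are my own, not taken from the tutorial.)

```python
import random

class PrioritizedReplayBuffer:
    """Minimal proportional prioritized replay buffer with fixed capacity.

    Once full, new transitions overwrite the oldest ones via a circular
    write index, so sampling still works after `capacity` steps.
    """

    def __init__(self, capacity):
        self.capacity = capacity
        self.data = []        # stored transitions
        self.priorities = []  # one priority per stored transition
        self.pos = 0          # circular write index

    def add(self, transition, priority=1.0):
        if len(self.data) < self.capacity:
            self.data.append(transition)
            self.priorities.append(priority)
        else:
            # Overwrite the oldest entry instead of growing the lists.
            self.data[self.pos] = transition
            self.priorities[self.pos] = priority
        self.pos = (self.pos + 1) % self.capacity

    def sample(self, batch_size):
        # Weighted sampling proportional to priority; non-empty as long
        # as the buffer holds at least one transition.
        return random.choices(self.data, weights=self.priorities, k=batch_size)
```

In my run, sampling only breaks after the buffer reaches its 10,000-entry capacity, which makes me suspect the overwrite/sampling logic at the capacity boundary rather than the sampling itself.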
Thanks, Aditya