yusukeurakami / dreamer-pytorch

PyTorch implementation of Dreamer (model-based image RL algorithm)
MIT License
162 stars 34 forks

About retrieving one batch from the memory for training #1

Closed ychenco closed 4 years ago

ychenco commented 4 years ago

https://github.com/yusukeurakami/dreamer-pytorch/blob/7e9050e8c454309de40bd0d1a4ec0256ef600147/memory.py#L33-L39

The sampling function does not seem to consider the case where a sampled sequence crosses an episode boundary. Would that affect performance?

yusukeurakami commented 4 years ago

@ychenco Hi! Sorry for the late reply. I am not sure I understood your question correctly, but this function makes sure that each trajectory is sampled from within a single episode, because crossing an episode boundary would break the consistency of the sequential state-action relationship. I have not checked whether this affects performance, but my guess is that sampling trajectories that cross episodes would worsen it.
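For readers following this thread, here is a minimal sketch of one way to restrict sampling to a single episode. It is not the code in `memory.py`: the function name `sample_within_episode`, the `dones` array of per-step terminal flags, and the flat (non-circular) buffer layout are all assumptions made for illustration.

```python
import numpy as np


def sample_within_episode(dones, length, rng):
    """Return `length` consecutive indices that stay inside one episode.

    Hypothetical helper: `dones[t]` is True when step t is the last step
    of its episode, and the buffer is assumed to be a flat array.
    """
    dones = np.asarray(dones, dtype=bool)
    n = len(dones)
    if length > n:
        raise ValueError("length exceeds the number of stored steps")

    # ends[i] = number of terminal steps in dones[:i]
    ends = np.concatenate(([0], np.cumsum(dones)))

    # A window starting at s is valid if no episode ends in steps
    # s .. s+length-2; the last step of the window may be terminal.
    starts = np.arange(n - length + 1)
    crossings = ends[starts + length - 1] - ends[starts]
    valid = starts[crossings == 0]
    if valid.size == 0:
        raise ValueError("no episode is long enough for this length")

    start = rng.choice(valid)
    return np.arange(start, start + length)


# Two episodes: steps 0-2 and steps 3-6.
dones = [False, False, True, False, False, False, True]
rng = np.random.default_rng(0)
idxs = sample_within_episode(dones, 3, rng)
```

With these flags, the only valid start positions for a window of length 3 are 0, 3, and 4, so every sampled window lies entirely inside one of the two episodes. A circular replay buffer would additionally need to exclude windows that wrap past the current write index.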