yusukeurakami / dreamer-pytorch

pytorch-implementation of Dreamer (Model-based Image RL Algorithm)
MIT License

Reward loss timescale #5


roggirg commented 3 years ago

Hi,

I believe the reward loss should be computed against `rewards[1:]` rather than `rewards[:-1]`: https://github.com/yusukeurakami/dreamer-pytorch/blob/7e9050e8c454309de40bd0d1a4ec0256ef600147/main.py#L209

If not, could you please explain your reasoning? Thanks!
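To make the off-by-one concrete, here is a minimal sketch of the two indexing conventions. All shapes and names here are assumptions for illustration, not the repo's actual buffer layout: it assumes a sampled chunk of length `T` and an RSSM rollout that produces `T - 1` posterior states, one per transition.

```python
import numpy as np

# Hypothetical chunk layout (an assumption, not the repo's confirmed
# convention): observations obs[0..T-1], actions act[0..T-1], and
# rewards rew[0..T-1] sampled together from the replay buffer.
T = 5
rew = np.arange(T, dtype=float)  # rew[t] is the reward stored at step t

# The RSSM rollout yields T - 1 posterior states, one per transition
# (obs[t], act[t]) -> obs[t+1]; states[i] corresponds to step i + 1.
n_states = T - 1

# Convention A: rew[t] is the reward for the transition *leaving* step t,
# so the state at step t + 1 (states[t]) should predict rew[t], and the
# targets are rew[:-1] -- the indexing currently in main.py.
targets_a = rew[:-1]

# Convention B: rew[t] is the reward observed *together with* obs[t],
# so states[t] (step t + 1) should predict rew[t + 1], and the targets
# are rew[1:] -- the indexing this issue proposes.
targets_b = rew[1:]

# Both slices match the number of rolled-out states, so either choice
# runs without error; only the stored-reward convention in the data
# collection loop decides which alignment is correct.
assert len(targets_a) == n_states and len(targets_b) == n_states
print(targets_a, targets_b)
```

Since both slices have the same shape, the bug (if it is one) would be silent: training proceeds either way, just with reward targets shifted by one step.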