danijar / dreamer

Dream to Control: Learning Behaviors by Latent Imagination
https://danijar.com/dreamer
MIT License
506 stars, 109 forks

Provided scores don't match the results #55

Closed: zhixuan-lin closed this issue 2 years ago

zhixuan-lin commented 2 years ago

Hello,

Thanks for the code! I just want to confirm: are the scores in dreamer.json from the old implementation or from this TF2 implementation? This was asked before but has not been answered yet (and I also ran into the same reproducibility issue for Pendulum Swingup).

zhixuan-lin commented 2 years ago

[Screenshot: Screen Shot 2022-04-22 at 12 21 49 PM]

The line labeled Dreamer comes from dreamer.json, and Dreamer-TF2 shows the results I got by running the code in this repo, averaged over 3 seeds.

danijar commented 2 years ago

Hi, the scores are from the same code base but with an earlier hyperparameter setting. The new hyperparameters work better in general, but on this task learning slowed down a little. Either way, I would recommend running DreamerV2. Hope that helps!