Thanks for the code! Just want to confirm: are the scores in dreamer.json from the old implementation or from this TF2 implementation? This was asked before but has not yet been answered (and I also ran into the same reproducibility issue for Pendulum Swingup).
Hi, the scores are from the same code base but with an earlier hyperparameter setting. The new hyperparameters work better in general, but on this task learning slowed down a little. Either way, I would recommend running DreamerV2. Hope that helps!