Closed HiddeLekanne closed 6 months ago
The same should hold for the .mode versions in DreamerV2, because for a normal distribution the mode equals the mean.
Hi @HiddeLekanne, yeah you're right: this should give us the same trained agent. I will try it asap
Lines like:
and
Don't do anything.
It's because you're not sampling from the distribution; you're simply taking the mean, which is what you started with anyway. You can confirm this by running a training session with and without the whole distribution creation and checking that the model learns exactly the same thing.
Lines are from DreamerV1
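To illustrate the point without the Dreamer code itself, here is a minimal sketch using only Python's standard library. It is not taken from DreamerV1 or DreamerV2; the helper names `via_distribution` and `direct`, the fixed standard deviation of 1.0, and the example values are all assumptions made up for the illustration.

```python
from statistics import NormalDist


def via_distribution(mu, sigma=1.0):
    """Wrap a predicted mean in a normal distribution, then read the mean back."""
    dist = NormalDist(mu, sigma)  # hypothetical stand-in for the distribution creation
    return dist.mean


def direct(mu):
    """Use the predicted mean as-is, with no distribution involved."""
    return mu


# Hypothetical network outputs (predicted means).
predictions = [-1.5, 0.0, 0.25, 3.0]

# Taking .mean of the distribution returns exactly what went in.
same_mean = all(via_distribution(m) == direct(m) for m in predictions)

# For a normal distribution the mode is the mean as well.
same_mode = all(NormalDist(m, 1.0).mode == m for m in predictions)

# Only actual sampling makes the distribution matter.
sampled_differs = any(
    NormalDist(m, 1.0).samples(1, seed=0)[0] != m for m in predictions
)

print(same_mean, same_mode, sampled_differs)
```

Under these assumptions, the mean and mode paths are identical to using the raw prediction, so the wrapper only changes behaviour if something is sampled from it.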