Closed sai-prasanna closed 3 months ago
In dreamerv2, the flag `mode=train` was passed to the posterior computation in `observe` to use the modes of the stochastic state. I notice that we now always sample. Is this intended, or is it inconsequential?
https://github.com/danijar/dreamerv3/blob/29eb964e2918a3f4db04086f7f51b60388e97f3d/dreamerv3/agent.py#L129
It's intended because it's easier and still works quite well.
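For readers unfamiliar with the distinction being discussed, here is a minimal sketch in plain Python of sampling a stochastic state versus taking its mode. It is not the dreamerv3 code: it assumes a single categorical latent, and `categorical_state`, `probs`, and the `sample` argument are hypothetical names chosen for illustration.

```python
import random


def categorical_state(probs, sample, rng):
    """Pick a class index from one categorical latent (hypothetical helper).

    sample=True draws an index according to `probs`;
    sample=False returns the mode, i.e. the most probable index.
    """
    indices = range(len(probs))
    if sample:
        return rng.choices(indices, weights=probs, k=1)[0]
    return max(indices, key=lambda i: probs[i])


rng = random.Random(0)
probs = [0.1, 0.6, 0.3]  # made-up posterior probabilities for one latent

# Using the mode is deterministic: the same index every time.
mode_state = categorical_state(probs, sample=False, rng=rng)

# Sampling is stochastic: indices vary from call to call.
sampled_states = [
    categorical_state(probs, sample=True, rng=rng) for _ in range(1000)
]
```

Under these assumptions, always sampling means there is no flag to thread through the posterior computation, which is consistent with the "easier" rationale given above.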