Closed anthony0727 closed 7 months ago
Hi @anthony0727, the notebook has been designed like this. Suppose that we want our agent to play for 200 steps while imagining for 45 steps (the defaults in the notebook). Our final objective is to compare how the imagination differs from the real behaviour: in our example we want to compare the last 45 steps. So:

- the agent plays for `initial_steps`, while saving everything in the `rb_initial` buffer;
- at the `initial_steps - imagination_steps - 1`-th step we save the recurrent and stochastic states: these will be used as the starting point for the imagination;
- the imagination is run for `imagination_steps`. During this time one can choose to really imagine actions or take the already played ones; those actions are used to compute the next stochastic and recurrent state from the world model, which are then used to reconstruct the image with the decoder. At the same time we reconstruct the image from the stochastic and recurrent states really played, so that we can also compare the reconstructions against the frames played.

Is it more clear now?
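The bookkeeping above can be sketched in plain Python. Everything here is a toy stand-in (variable names like `saved_state` and the placeholder observations are illustrative, not the sheeprl API); it only shows *when* the state snapshot is taken and *which* real frames end up being compared against the imagination:

```python
# Toy sketch of the notebook's bookkeeping (hypothetical names, not the sheeprl API).
initial_steps = 200      # real steps played by the agent
imagination_steps = 45   # steps imagined at the end

replay = []              # stands in for the rb_initial buffer
saved_state = None       # stands in for the snapshotted recurrent/stochastic states

for step in range(initial_steps):
    obs = step           # placeholder observation
    replay.append(obs)
    if step == initial_steps - imagination_steps - 1:
        # snapshot the recurrent and stochastic states here:
        # this is the starting point of the imagination
        saved_state = obs

# the imagination then rolls forward for `imagination_steps` steps starting
# from `saved_state`, and its reconstructions are compared with the last
# `imagination_steps` real frames stored in the buffer
real_frames_to_compare = replay[-imagination_steps:]
```

So with the defaults, the snapshot happens at step 154 and the comparison window is the last 45 frames.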
Thanks for the reply! But my question is: why isn't the "next" stochastic state used to compute the one after it, like it is in behaviour learning? https://github.com/Eclectic-Sheep/sheeprl/blob/2bae37985d789a67d569bf37ed937b9445ae9ab8/sheeprl/algos/dreamer_v3/dreamer_v3.py#L236
vs
```python
# imagination step
imagined_stochastic_state, recurrent_state = world_model.rssm.imagination(
    stochastic_state, recurrent_state, actions
)
```
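For reference, the fix amounts to feeding the imagined stochastic state back in as the input of the next imagination step, mirroring the behaviour-learning loop. A minimal toy sketch of that pattern (the `imagination` function below is a dummy stand-in with fake dynamics, not the sheeprl RSSM):

```python
# Dummy stand-in for world_model.rssm.imagination: takes the current
# (stochastic_state, recurrent_state, action) and returns the next pair.
def imagination(stochastic_state, recurrent_state, action):
    return stochastic_state + action, recurrent_state + 1  # fake dynamics

stochastic_state, recurrent_state = 0, 0
for action in [1, 2, 3]:
    imagined_stochastic_state, recurrent_state = imagination(
        stochastic_state, recurrent_state, action
    )
    # the point of the fix: the imagined state becomes the input of the
    # next step, instead of reusing the stale initial `stochastic_state`
    stochastic_state = imagined_stochastic_state
```

Without that last assignment, every step would start from the same initial state and the rollout would never actually advance through imagined latents.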
You're super right! Thank you both @anthony0727 and @michele-milesi for spotting that! I thought I was going blind! I'll fix it up right now.
Yup I think the fix is correct! We can close this issue!
I can't fully understand the imagination process in https://github.com/Eclectic-Sheep/sheeprl/blob/main/notebooks/dreamer_v3_imagination.ipynb. Starting from the "context" at the beginning of the imagination, the rollout is performed, but doesn't `stochastic_state_{t-1}` have to be fed into `world_model.rssm.imagination` to output `stochastic_state_{t}`? I'd really appreciate it if you could help me understand this process!