Closed eliork closed 3 years ago
That wrapper follows the frame-stacking idea of the original DQN, where the last four images are stacked along their channel dimension, which in code translates to stacking the observations on the last axis. This would mean your (1, 128) observations would become (1, 128 * 4), but do note that this wrapper was not designed for non-image observations per se.
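To make the last-axis stacking concrete, here is a minimal NumPy sketch of the roll-and-overwrite pattern. It is not the wrapper's actual code: the `push` helper and the sizes are made up for illustration, and only the slice assignment mirrors the line quoted in the question.

```python
import numpy as np

# Illustrative sizes (assumptions): one env, 128-dim latent code, stack of 4.
n_envs, latent_dim, n_stack = 1, 128, 4

# Buffer holding the last n_stack observations side by side on the last axis.
stackedobs = np.zeros((n_envs, latent_dim * n_stack), dtype=np.float32)


def push(stackedobs, observations):
    """Hypothetical helper: shift the buffer left by one observation,
    then write the newest observation into the rightmost slot."""
    last_ax_size = observations.shape[-1]
    # Roll moves the oldest observation to the right end...
    stackedobs = np.roll(stackedobs, shift=-last_ax_size, axis=-1)
    # ...where it is overwritten by the newest one (the line from the question).
    stackedobs[..., -last_ax_size:] = observations
    return stackedobs


# Feed five dummy observations filled with 1.0, 2.0, ..., 5.0.
for t in range(1, 6):
    obs = np.full((n_envs, latent_dim), float(t), dtype=np.float32)
    stackedobs = push(stackedobs, obs)

print(stackedobs.shape)             # (1, 512)
print(stackedobs[0, ::latent_dim])  # [2. 3. 4. 5.]
```

So `observations` only ever holds the newest observation; the history lives in `stackedobs`, ordered oldest to newest along the last axis.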
Edit: The extra line is for handling terminal states correctly: we have to update the terminal-state information in the info dict to account for frame stacking (note that with VecEnvs you do not normally receive the terminal observation, so we have to use the info dict to pass it forward).
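A rough sketch of that terminal-state handling, in plain NumPy. Treat the details as assumptions: the `stack_step` function is hypothetical, and the `terminal_observation` key is how I recall the info dict entry being named, so check it against the linked source.

```python
import numpy as np


def stack_step(stackedobs, observations, dones, infos):
    """Hypothetical stand-in for the wrapper's step logic.

    `observations` are the newest per-env observations; for envs that just
    finished, these are already the first observations of the next episode.
    """
    last_ax_size = observations.shape[-1]
    stackedobs = np.roll(stackedobs, shift=-last_ax_size, axis=-1)
    for i, done in enumerate(dones):
        if done:
            # Assumed key name: the unstacked final observation of the episode.
            if "terminal_observation" in infos[i]:
                old_terminal = infos[i]["terminal_observation"]
                # Previous frames plus the true last frame of the episode.
                infos[i]["terminal_observation"] = np.concatenate(
                    (stackedobs[i, ..., :-last_ax_size], old_terminal), axis=-1
                )
            # Do not leak frames from the finished episode into the next one.
            stackedobs[i] = 0
    stackedobs[..., -last_ax_size:] = observations
    return stackedobs, infos


# Tiny example: 1 env, 2-dim observations, stack of 3.
buf = np.array([[1, 1, 2, 2, 3, 3]], dtype=np.float32)
reset_obs = np.array([[9, 9]], dtype=np.float32)
infos = [{"terminal_observation": np.array([4, 4], dtype=np.float32)}]

buf, infos = stack_step(buf, reset_obs, dones=[True], infos=infos)
print(buf)                               # [[0. 0. 0. 0. 9. 9.]]
print(infos[0]["terminal_observation"])  # [2. 2. 3. 3. 4. 4.]
```

The stacked terminal observation ends up in the info dict, while the returned buffer starts the new episode with zeros plus the reset observation.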
> This would mean your (1, 128) observations would become (1, 128 * 4), but do note that this wrapper was not designed for non-image observations per se.
Thanks. I am trying to stack 4 observations together, each representing the latent space of a VAE encoder. My thought was to take the same idea behind frame stacking and apply it to the VAE latent codes, since each code does represent an image. Or is my reasoning completely wrong, given what you wrote about the wrapper not being designed for non-image observations?
Thank you for your detailed explanation
Linking related issue: https://github.com/araffin/learning-to-drive-in-5-minutes/issues/36
@eliork I suggest you take a look at the link araffin provided (very similar setup). Frame stacking can kind of work here: the latent codes of the four images are fed to the network and it will happily process them. You will lose the temporal information, but the same happens with the Atari way of stacking frames. I expect this solution to work better than feeding just a single frame, but I cannot say for sure, and you need to run experiments to see what works best for your setup.
Will do, thank you very much!
https://github.com/hill-a/stable-baselines/blob/259f27868f0d727d990f50e04da6e3a5d5367582/stable_baselines/common/vec_env/vec_frame_stack.py#L27-L43
Hi, I am trying to read this code, and I am having difficulty understanding how the stacking is done. For example, if I have a single observation of shape (1, 128) and I want to stack 4 observations, does observations hold the 4 observations together? Where are the 4 observations concatenated? Also, what does this line mean?
self.stackedobs[..., -observations.shape[-1]:] = observations
Thank you