Need some clarifications about details in Atari env

danijar / dreamerv3

Mastering Diverse Domains through World Models

MIT License


Hi, in embodied/envs/atari.py, I don't understand the intent of two parts of the code:

In def _reset(self):

    for i, dst in enumerate(self.buffers):
      if i > 0:
        np.copyto(self.buffers[0], dst)

What is the purpose of the np.copyto call? Is it some sort of reset of the image buffers?
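To make the question concrete, here is a minimal sketch of what that call does on a toy stand-in for self.buffers. The shapes and values are made up and not taken from the repo. As far as I understand, np.copyto takes the destination first and the source second, so the quoted call writes each later buffer into buffers[0] rather than the other way around. I may be misreading the intent, so the swapped order is shown only for comparison:

```python
import numpy as np

def make_buffers():
    # Hypothetical stand-in for self.buffers: buffers[0] holds a "fresh"
    # frame (all 7s), buffers[1] holds a "stale" frame (all 0s).
    return [np.full((2, 2), 7, np.uint8), np.zeros((2, 2), np.uint8)]

# Same call as quoted above. np.copyto(dst, src) writes src INTO dst,
# so here the stale buffer overwrites buffers[0].
quoted = make_buffers()
for i, dst in enumerate(quoted):
    if i > 0:
        np.copyto(quoted[0], dst)

# Swapped argument order, for comparison only: the fresh frame in
# buffers[0] is copied into every other buffer.
swapped = make_buffers()
for i, dst in enumerate(swapped):
    if i > 0:
        np.copyto(dst, swapped[0])

print(quoted[0].tolist(), quoted[1].tolist())
print(swapped[0].tolist(), swapped[1].tolist())
```

With the quoted order, the fresh frame in buffers[0] is replaced by the contents of the older buffer. With the swapped order, every buffer ends up holding the fresh frame, which is what I would have expected from a reset.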

In def _obs(self, reward, is_first=False, is_last=False, is_terminal=False):

    if self.aggregate == 'max':
      image = np.amax(self.buffers, 0)
    elif self.aggregate == 'mean':
      image = np.mean(self.buffers, 0).astype(np.uint8)

Why do we need to take np.amax or np.mean across the image buffers? Are there potential issues if we just use self.buffers[0]?
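My guess is that this relates to the common Atari preprocessing step of max-pooling consecutive frames, since some games draw sprites only on alternating frames. Here is a small sketch of how the three options differ on toy data. The frame contents are made up, and I am assuming buffers[0] is the most recent frame:

```python
import numpy as np

# Two consecutive toy 1x4 grayscale frames (hypothetical values).
# The object at index 1 is drawn only in the older frame, mimicking a
# sprite that flickers; the object at index 3 is drawn in both frames.
newest = np.array([[0, 0, 0, 200]], np.uint8)
older = np.array([[0, 255, 0, 200]], np.uint8)
buffers = [newest, older]

max_image = np.amax(buffers, 0)                    # keeps the flickering object
mean_image = np.mean(buffers, 0).astype(np.uint8)  # keeps it at reduced intensity
first_only = buffers[0]                            # loses it entirely

print(max_image.tolist())   # [[0, 255, 0, 200]]
print(mean_image.tolist())  # [[0, 127, 0, 200]]
print(first_only.tolist())  # [[0, 0, 0, 200]]
```

If that reading is right, using only self.buffers[0] would make a flickering object invisible on every other observation, while both aggregations keep it visible.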

Could you clarify a bit regarding these points? Thank you very much.
