Closed chitwansaharia closed 5 years ago
Hi Chitwan!
I'm assuming this mask marks the steps at which episodes are still active, i.e. not done? (This needs a comment in the code.) Are you saying it should be initialized with zeros instead? What effect would this error have?
Hi Maxime, I rechecked the code. On its own, this mask initialization would have been an issue, but the memory initialization (zero at the start of every process) seems to take care of it. The requirement is that the memory be zero at the start of each process's experience. Since the memory is itself zero-initialized, that requirement is already met.
I am closing this issue :)
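The reasoning in the comment above can be sketched in a few lines. This is only an illustration, assuming the recurrent memory is multiplied by the mask before being fed to the model (the thread does not show that code); `masked_memory` is a hypothetical helper, not a function from the repository.

```python
def masked_memory(memory, mask):
    """Hypothetical helper: scale a recurrent memory vector by a scalar mask."""
    return [mask * m for m in memory]

# Assumption: memory is zero-initialized at the start of every process.
zero_memory = [0.0, 0.0, 0.0]

# With zero memory, a mask of 1 and a mask of 0 give the same result.
with_ones = masked_memory(zero_memory, 1)
with_zeros = masked_memory(zero_memory, 0)

# With non-zero memory the mask value does matter.
carried_memory = [0.5, 0.2, 0.1]
kept = masked_memory(carried_memory, 1)
reset = masked_memory(carried_memory, 0)
```

Under these assumptions the initial value of the mask only matters when the memory it multiplies is non-zero, which is not the case on the first frame.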
self.mask here is initialised as an array of 1s. However, when flattening out the mask here it will be buggy. Here is an example to illustrate the problem.

num_procs = 2, num_frames_per_proc = 4

After accumulating the experience, self.masks would be something like this:

[[1, x, x, x],
 [1, x, x, x]]

When flattening it out, it will look like this:

[1, x, x, x, 1, x, x, x]

which should ideally be

[0, x, x, x, 0, x, x, x]
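The flattening described above can be reproduced with plain Python lists. This is a rough sketch rather than the repository's code: the buffer layout (one row per frame, one column per process) and the choice of 1 for every "x" entry (no episode ends) are assumptions made for illustration.

```python
num_procs = 2
num_frames_per_proc = 4

# Initial mask: one entry per process, all 1s as described in the issue.
initial_mask = [1] * num_procs

# Assumed buffer layout: masks[frame][proc]. The first row is the initial
# mask; the remaining "x" entries are set to 1 (no episode finished).
masks = [initial_mask] + [[1] * num_procs for _ in range(num_frames_per_proc - 1)]

# Transpose to [proc][frame], then flatten so each process's frames are contiguous.
per_proc = [list(column) for column in zip(*masks)]
flat = [value for row in per_proc for value in row]

# Positions in the flat list where each process's experience begins.
starts = [p * num_frames_per_proc for p in range(num_procs)]
start_values = [flat[i] for i in starts]
```

Here `start_values` holds the mask at the first frame of each process, which is 1 in this sketch, whereas the issue argues it should ideally be 0.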