Khrylx / PyTorch-RL

PyTorch implementation of Deep Reinforcement Learning: Policy Gradient methods (TRPO, PPO, A2C) and Generative Adversarial Imitation Learning (GAIL). Fast Fisher vector product TRPO.
MIT License

Concatenation of memories with not terminated episode #5

Closed lcswillems closed 6 years ago

lcswillems commented 6 years ago

https://github.com/Khrylx/PyTorch-RL/blob/61960d516c85e912e476b41f764a8a5f8cf38cf8/core/agent.py#L47

Hi,

Thank you for your code, which is really well written! From my understanding, mask is 0 at the end of an episode and 1 otherwise. However, there will be a problem if you concatenate a memory M1 (whose last episode is not terminated) with a memory M2: after concatenation, the computation of returns will be wrong.
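To make the concern concrete, here is a minimal sketch of a mask-based discounted-return computation (the names `compute_returns`, `rewards`, `masks`, and `gamma` are illustrative, not taken from `agent.py`). If M1's last episode is truncated with mask 1, its returns bootstrap into M2's first reward, which belongs to a different episode:

```python
def compute_returns(rewards, masks, gamma):
    """Backward pass: mask == 0 cuts the bootstrap at an episode end."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + gamma * masks[i] * running
        returns[i] = running
    return returns

gamma = 0.9

# M1 ends mid-episode (last mask is 1); M2 holds a separate episode.
m1_rewards, m1_masks = [1.0, 1.0], [1, 1]
m2_rewards, m2_masks = [5.0], [0]

# Naive concatenation: M1's unfinished episode wrongly absorbs
# M2's reward through the surviving mask = 1 at the boundary.
returns = compute_returns(m1_rewards + m2_rewards, m1_masks + m2_masks, gamma)
print(returns)  # M1's entries include gamma-discounted pieces of M2's 5.0
```

Zeroing the mask at the boundary would instead yield truncated returns for M1 that ignore M2 entirely.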

To correct this, I think mask should be 0 at the end of an episode, or when num_steps = min_batch_size - 1.
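The proposed fix can be sketched as follows (again with illustrative names, assuming the same backward return computation as in the codebase): force the mask of the final collected step to 0, so a truncated episode never bootstraps into the next memory.

```python
def compute_returns(rewards, masks, gamma):
    """Backward pass: mask == 0 cuts the bootstrap at an episode end."""
    returns, running = [0.0] * len(rewards), 0.0
    for i in reversed(range(len(rewards))):
        running = rewards[i] + gamma * masks[i] * running
        returns[i] = running
    return returns

m1_rewards, m1_masks = [1.0, 1.0], [1, 1]  # episode truncated mid-run
m2_rewards, m2_masks = [5.0], [0]

# Proposed correction: mark the truncation point before concatenating.
m1_masks[-1] = 0

returns = compute_returns(m1_rewards + m2_rewards, m1_masks + m2_masks, 0.9)
# M1's returns no longer absorb M2's reward across the boundary.
```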

Lucas

lcswillems commented 6 years ago

There is no bug at all! Sorry for filing this issue in error...