smorad closed this issue 2 years ago
Hi,
To do BPTT over chunks of N timesteps, you can proceed as shown in the A2C implementation:
Basically, the idea is: you sample N timesteps into the workspace, then you copy the last timestep to position 0 and continue sampling from timestep 1. This lets you 'split' trajectories into pieces without losing continuity in the acquisition. Each backward pass will thus backpropagate over the last N timesteps.
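To make the copy-and-continue idea concrete, here is a minimal sketch in plain Python. It does not use Salina at all: a list stands in for the workspace, and `acquire_chunks` is a hypothetical helper name chosen for illustration only.

```python
def acquire_chunks(stream, n):
    """Toy stand-in for chunked acquisition (not Salina code).

    Fill a buffer of n slots, then copy the last slot to position 0
    and keep filling from position 1, so consecutive chunks share
    exactly one timestep.
    """
    buffer = [None] * n
    chunks = []
    it = iter(stream)
    start = 0  # the first chunk is filled from position 0
    while True:
        filled = start
        for t in range(start, n):
            try:
                buffer[t] = next(it)
            except StopIteration:
                break
            filled = t + 1
        if filled == start:
            break  # no new timestep was acquired
        chunks.append(buffer[:filled])
        if filled < n:
            break  # the stream ended inside this chunk
        buffer[0] = buffer[n - 1]  # last timestep becomes position 0
        start = 1                  # continue sampling from timestep 1
    return chunks


print(acquire_chunks(range(7), 4))
```

Each chunk after the first begins with the final timestep of the previous one, which is what keeps the trajectory continuous across chunk boundaries.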
The `copy_n_last_steps` method copies the last `n` timesteps in the workspace to the first `n` positions, and can be used with `n > 1` to generate sliding windows, as is done for instance in R2D2.
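As a rough illustration of the sliding-window case, the sketch below mimics the copy on a plain Python list. `copy_last_to_front` and `sliding_windows` are hypothetical names for this toy and are not part of Salina; the real method operates on the workspace tensors.

```python
from itertools import islice


def copy_last_to_front(buffer, n):
    """Copy the last n entries of buffer over its first n entries."""
    if not 0 < n <= len(buffer):
        raise ValueError("n must be between 1 and len(buffer)")
    buffer[:n] = buffer[-n:]


def sliding_windows(items, size, overlap):
    """Build windows of `size` items that share `overlap` items.

    An incomplete trailing window is dropped in this toy version.
    """
    if not 0 < overlap < size:
        raise ValueError("overlap must satisfy 0 < overlap < size")
    it = iter(items)
    buffer = list(islice(it, size))
    windows = []
    while len(buffer) == size:
        windows.append(list(buffer))
        copy_last_to_front(buffer, overlap)  # keep the last `overlap` steps
        fresh = list(islice(it, size - overlap))
        if len(fresh) < size - overlap:
            break  # not enough new items for a full window
        buffer[overlap:] = fresh  # overwrite the rest with new steps
    return windows


print(sliding_windows(range(10), size=4, overlap=2))
```

With `overlap=1` this reduces to the plain chunking described earlier; larger overlaps give each window more context from the previous one.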
Does this answer your question?
Yes, this is precisely it. Thanks!
Hello,
I'm interested in loading and storing recurrent states for training over longer episodes. This is generally called truncated backpropagation through time (BPTT). For example, in the following case we break each trajectory into 80-timestep chunks:
Currently, if an episode is longer than 80 timesteps, it will receive a recurrent state of zeros. Does Salina provide a way to load the previous recurrent state?
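To illustrate the difference I mean, here is a small framework-free sketch. The recurrence `step` and the helper `run_chunked` are made up for this example and have nothing to do with Salina's API; only the forward pass is shown, so the gradient truncation itself (detaching the carried state in an autograd framework) is not modelled.

```python
def step(h, x):
    """Hypothetical scalar recurrence standing in for an RNN cell."""
    return 0.5 * h + x


def run_chunked(xs, chunk_len, carry_state):
    """Process xs in chunks of chunk_len timesteps.

    If carry_state is True, each chunk starts from the final state of
    the previous chunk; otherwise the state is reset to zero at every
    chunk boundary.
    """
    h = 0.0
    outputs = []
    for start in range(0, len(xs), chunk_len):
        if not carry_state:
            h = 0.0  # a zero state at each chunk start
        for x in xs[start:start + chunk_len]:
            h = step(h, x)
            outputs.append(h)
    return outputs


xs = [1.0, 1.0, 1.0, 1.0]
print(run_chunked(xs, chunk_len=2, carry_state=True))
print(run_chunked(xs, chunk_len=2, carry_state=False))
```

With the state carried over, the chunked run matches a single pass over the whole episode; with the reset, the second chunk behaves as if the episode had just started.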