There is a bug in the trajectorywriter/offline dataset: when online training finishes, any trajectories still in progress get cut off. This leaves “short” truncated trajectories in the dataset, which are bad for our data, so it would be good to remove them. They are visible in the plot of reward over trajectory length as points on the x-axis at lengths below max-length.
For reference, here is the method I use to make sure these get labelled as truncated, to avoid bugs: https://github.com/jbloomAus/DecisionTransformerInterpretability/blob/c84edb381c53b3f9ef2fa9517e34914a52e15fbd/src/utils.py#L59
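
A rough sketch of what the removal could look like, not the actual fix. It assumes each trajectory is a dict with per-step `rewards`, `dones` and `truncated` lists; that layout and the name `drop_short_truncated` are hypothetical and not taken from the repository. A trajectory is dropped only if it was truncated, did not terminate, and is shorter than the maximum length.

```python
def drop_short_truncated(trajectories, max_len):
    """Return only trajectories that terminated or ran to max_len.

    Assumed (hypothetical) layout: each trajectory is a dict with
    equal-length lists under "rewards", "dones" and "truncated".
    """
    kept = []
    for traj in trajectories:
        length = len(traj["rewards"])
        if length == 0:
            # Nothing was recorded for this trajectory, so drop it.
            continue
        terminated = bool(traj["dones"][-1])
        truncated = bool(traj["truncated"][-1])
        # Cut off early: flagged truncated, never terminated, below max_len.
        cut_short = truncated and not terminated and length < max_len
        if not cut_short:
            kept.append(traj)
    return kept
```

Trajectories truncated exactly at `max_len` are kept, since those hit the environment's step limit rather than the end of training.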