leggedrobotics / rsl_rl

Fast and simple implementation of RL algorithms, designed to run fully on GPU.
Other
505 stars 156 forks source link

fixed a bug with rnn padding and unpadding #12

Closed b-vm closed 10 months ago

b-vm commented 11 months ago

The padding and unpadding of tensors for rnns was incorrect.

The padded obs and mask were not of the same shape. This might have only ocurred when max(traj_lens) < num_transitions_per_env

It is fixed now. Unfortunately, for unpadding, it is necessary to get at least one of the original dimensions as a parameter, as it is impossible to know the correct shape otherwise. So there is a bit of parameter passing.

TextZip commented 11 months ago

The padding and unpadding of tensors for rnns was incorrect.

Hey, Does the current repo throw any errors for you when you try to use ActorCriticRecurrent? I was able to train without any issues. (Was just curious if this was a silent failure or if there was an error)

Faced the same error can confirm that the patch fixes the issue.

b-vm commented 11 months ago

It was an explicit error, but only in the very specific case where I had a very tight termination condition. I think it lead to the rare case where all trajs are shorter than the specified traj length per env.

nikitardn commented 10 months ago

Thanks for pointing out this problem. I think we can solve it in a slightly simpler way without extra parameters. We can ensure that the padding is done correctly like this (): def split_and_pad_trajectories(tensor, dones): ... trajectories = trajectories + (torch.zeros(tensor.shape[0], tensor.shape[-1], device=tensor.device), ) # add at least one full length trajectory padded_trajectories = torch.nn.utils.rnn.pad_sequence(trajectories) padded_trajectories = padded_trajectories[:, :-1] # remove the added tensor ... Could you please check if this solves the problem?

b-vm commented 10 months ago

It works! Nice way of solving it :+1: Will commit in a second.

Padding it like this adds a lot of unused zeros which makes it a tiny bit slower(only like 1% in timesteps per sec in my env), but that is only the case in the beginning when trajs are short so it doesnt matter.

Mayankm96 commented 10 months ago

Closing this for now. Please reopen if the issue is still present on master. Thanks a lot!