real-stanford / diffusion_policy

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
https://diffusion-policy.cs.columbia.edu/
MIT License

ConditionalUnet1D up_modules #32

Closed janneshb closed 7 months ago

janneshb commented 7 months ago

Hi, thanks for this amazing project!

I'm having trouble using the ConditionalUnet1D as part of a custom low-dimensional policy. I have a pretty simple setup with a set of 12-dimensional actions and corresponding 42-dimensional observations. I want to predict the action for a given observation, i.e. horizon=1, n_action_steps=1 and n_obs_steps=1.

When calling the ConditionalUnet1D model (i.e. the forward method) I keep getting a dimension mismatch error:

    for idx, (resnet, resnet2, upsample) in enumerate(self.up_modules):
        x = torch.cat((x, h.pop()), dim=1)

RuntimeError: Sizes of tensors must match except in dimension 1. Expected size 2 but got size 1 for tensor number 1 in the list.

Here is the link to the line in the repo.

This seems to stem from the previous iteration of the for loop where the upsample call returns a tensor with dimensions (256, 512, 2). This is incompatible with the next entry in the h list which has dimensions (256, 512, 1). Due to the mismatch in dimension 2, they cannot be concatenated along axis 1.
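The length arithmetic behind this mismatch can be sketched without running the model. The sketch below assumes the downsample layer is a stride-2 `Conv1d` (kernel 3, padding 1), the upsample layer is a stride-2 `ConvTranspose1d` (kernel 4, padding 1), and the U-Net has three levels; none of these hyperparameters are confirmed in this thread. The function names are hypothetical helpers, not part of the repository. Only the standard PyTorch output-length formulas are used.

```python
def conv1d_out_len(length, kernel=3, stride=2, padding=1):
    """Output length of a Conv1d (dilation 1), per the PyTorch formula."""
    return (length + 2 * padding - (kernel - 1) - 1) // stride + 1


def conv_transpose1d_out_len(length, kernel=4, stride=2, padding=1):
    """Output length of a ConvTranspose1d (dilation 1, no output padding)."""
    return (length - 1) * stride - 2 * padding + (kernel - 1) + 1


def trace_unet_lengths(horizon, n_levels=3):
    """Return (x_len, skip_len) for every concatenation in the up path.

    Assumed layout: each down level stores a skip tensor and then
    downsamples (except the last level); each up level pops a skip
    tensor, concatenates along the channel axis, then upsamples.
    """
    skips = []
    x = horizon
    for level in range(n_levels):
        skips.append(x)
        if level < n_levels - 1:  # assumed: last down level does not downsample
            x = conv1d_out_len(x)

    pairs = []
    for _ in range(n_levels - 1):
        skip = skips.pop()
        pairs.append((x, skip))  # torch.cat needs x == skip here
        x = conv_transpose1d_out_len(x)
    return pairs


print(trace_unet_lengths(1))   # [(1, 1), (2, 1)]
print(trace_unet_lengths(16))  # [(4, 4), (8, 8)]
```

Under these assumptions, a length-1 sequence stays at length 1 through every downsample, but the first upsample doubles it to 2, so the second concatenation compares length 2 against a skip tensor of length 1. That matches the (256, 512, 2) versus (256, 512, 1) shapes reported above.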

If I simply comment out the upsample call (i.e. this line), everything seems to work fine and I even get reasonable results.

Might there be an issue with the upsample module or did I not configure my dimensions correctly?

Thanks! Jannes

cheng-chi commented 7 months ago

Hi @janneshb, unfortunately our ConditionalUnet1D as is can only predict action sequences longer than 12 steps. If you really want action horizon == 1, you can predict 12 steps of actions and discard the remaining 11. Just be warned that a decently long action horizon is critical for good performance. If your task environment only supports single-step observation and action, it's better to change the environment. I suspect that the urge to make all policies fit in the MDP/gym mold has hindered progress in robotics.
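A minimal sketch of the "predict 12 steps and discard the rest" suggestion, assuming the policy's predicted actions arrive as an array of shape (batch, horizon, action_dim). The helper `keep_first_steps` is hypothetical, and a NumPy array of zeros stands in for the real model output; the same slicing should apply to a tensor with that layout.

```python
import numpy as np


def keep_first_steps(action_pred, n_action_steps=1):
    """Keep only the first n_action_steps of each predicted sequence.

    action_pred: array of shape (batch, horizon, action_dim).
    """
    if action_pred.ndim != 3:
        raise ValueError("expected shape (batch, horizon, action_dim)")
    return action_pred[:, :n_action_steps, :]


# Placeholder for the model output: 4 samples, 12 predicted steps,
# 12-dimensional actions (the action size mentioned in this thread).
batch, horizon, action_dim = 4, 12, 12
action_pred = np.zeros((batch, horizon, action_dim))

executed = keep_first_steps(action_pred)
print(executed.shape)  # (4, 1, 12)
```

The model is still trained and sampled over the full 12-step horizon; only the first step is executed and the other 11 are dropped.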

janneshb commented 7 months ago

Hi @cheng-chi, thanks for the quick response. In our setting the direct observation-to-action relationship with horizon=1 makes a lot of sense. But I will try what you suggested and simply discard any future predictions.