LostXine / crossway_diffusion

The official code of our ICRA'24 paper Crossway Diffusion: Improving Diffusion-based Visuomotor Policy via Self-supervised Learning
MIT License
49 stars · 2 forks

Code about ablations #5

Open tomo120717 opened 1 month ago

tomo120717 commented 1 month ago

Thank you for your great work and beautiful code.

Could you upload your code for the ablations? You ran many ablations in your paper, such as visual-only and future prediction, and I would like to reproduce them myself.

Thanks,

LostXine commented 1 month ago

Hi @tomo120717 ,

Thanks again for your interest. I'll give you a quick note here to replicate the ablations.

  1. Visual-only: just comment out this block, which reconstructs the low-dim states: https://github.com/LostXine/crossway_diffusion/blob/d53a4a30e88ab03f75ec4cc1a01b372bd744cd51/diffusion_policy/model/diffusion/conditional_unet1d.py#L438-L441

  2. Future prediction: first, add the following code after this line

    future_nobs = dict_apply(
        nobs,
        lambda x: x[:, self.n_pred_obs_shift: self.n_pred_obs_shift + self.n_obs_steps, ...].reshape(-1, *x.shape[2:]))
    # self.n_pred_obs_shift controls how far into the future to predict;
    # when it is 0, the model reconstructs the current observation without future prediction.

    Then replace the following block https://github.com/LostXine/crossway_diffusion/blob/d53a4a30e88ab03f75ec4cc1a01b372bd744cd51/diffusion_policy/policy/crossway_diffusion_unet_hybrid_image_policy.py#L385-L388 with

    if rec.shape != future_nobs[key].shape:
        rec_target = T.Resize(rec.shape[-2:])(future_nobs[key])
    else:
        rec_target = future_nobs[key]
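    To sanity-check the two snippets above outside the repo, here is a self-contained sketch: NumPy arrays stand in for torch tensors, `shifted_window` mimics the `future_nobs` slicing, and `nn_resize` is a hypothetical nearest-neighbor stand-in for torchvision's `T.Resize` (all three names are made up for this example).

```python
import numpy as np

def shifted_window(x: np.ndarray, shift: int, n_obs_steps: int) -> np.ndarray:
    """Mimic the future_nobs slicing: take n_obs_steps frames starting
    `shift` steps into the future, then fold batch and time together."""
    return x[:, shift: shift + n_obs_steps, ...].reshape(-1, *x.shape[2:])

def nn_resize(img: np.ndarray, size: tuple) -> np.ndarray:
    """Hypothetical nearest-neighbor stand-in for torchvision's T.Resize."""
    h, w = img.shape[-2:]
    new_h, new_w = size
    rows = np.arange(new_h) * h // new_h
    cols = np.arange(new_w) * w // new_w
    return img[..., rows[:, None], cols]

# batch of 4 trajectories, horizon 16, tiny 3x8x8 "images"
x = np.zeros((4, 16, 3, 8, 8))
future = shifted_window(x, shift=2, n_obs_steps=2)
print(future.shape)  # (8, 3, 8, 8): 4 trajectories x 2 future frames

# match the reconstruction resolution only when the shapes differ
rec_shape = (8, 3, 4, 4)
rec_target = nn_resize(future, rec_shape[-2:]) if future.shape != rec_shape else future
print(rec_target.shape)  # (4, 4) spatial size after resizing
```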

Best,

tomo120717 commented 1 month ago

Hi @LostXine ,

Thank you for your quick response and clean code. I understand what you did, and I'll try it.

Best,

tomo120717 commented 1 month ago

Hi @LostXine ,

I tried your future prediction code, but it doesn't work. The variable `future_nobs` ends up with a batch size of 0, so it cannot be compared with `rec`.

I printed the shapes of `rec` and `rec_target`:

    key: agentview_image
    rec.size(): torch.Size([32, 3, 84, 84])
    rec_target.size(): torch.Size([0, 3, 84, 84])

Did you change the robomimic_replay_image_dataset.py file from the original Diffusion Policy version?

Please help me.

Best,

LostXine commented 1 month ago

Hi @tomo120717 , could you share your config file here? I want to check the values of horizon and n_pred_obs_shift. Thank you.

tomo120717 commented 1 month ago

Hi @LostXine ,

Thank you for your reply. horizon = 16 and n_pred_obs_shift = 2.

Thank you.

LostXine commented 1 month ago

Hi @tomo120717 ,

Please delete this line and try again. https://github.com/LostXine/crossway_diffusion/blob/d53a4a30e88ab03f75ec4cc1a01b372bd744cd51/config/tool_hang_ph/typea.yaml#L137

The reason behind it: https://github.com/LostXine/crossway_diffusion/blob/d53a4a30e88ab03f75ec4cc1a01b372bd744cd51/diffusion_policy/dataset/robomimic_replay_image_dataset.py#L217-L221
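For context on why deleting that config line matters: it enables a memory-saving optimization in the dataset that loads only the first n_obs_steps frames of each image key, so a slice starting at n_pred_obs_shift > 0 falls past the loaded range and comes back empty. A minimal NumPy illustration of the symptom (shapes are made up for the example):

```python
import numpy as np

n_obs_steps, n_pred_obs_shift = 2, 2

# Emulate the optimization: only the first n_obs_steps frames are available,
# so the time axis is truncated to length n_obs_steps.
truncated = np.zeros((4, n_obs_steps, 3, 8, 8))

# The future_nobs slice then starts past the end of the time axis,
# producing an empty batch like the reported torch.Size([0, 3, 84, 84]).
future = truncated[:, n_pred_obs_shift: n_pred_obs_shift + n_obs_steps, ...]
future = future.reshape(-1, *truncated.shape[2:])
print(future.shape)  # (0, 3, 8, 8): empty, nothing to compare against rec
```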

Thanks,

tomo120717 commented 1 month ago

Hi, @LostXine ,

Thank you for your reply!

The program works well with your advice! Additionally, I want to visualize the ground-truth future state image in wandb. How did you do this?

Best,

LostXine commented 1 month ago

Hi @tomo120717 ,

To upload images to wandb, you can refer to this link: https://docs.wandb.ai/guides/track/log/media

Thanks,

tomo120717 commented 1 month ago

Hi @LostXine ,

Sorry, my previous message was unclear.

What I meant is: in your code, you visualize the reconstruction result and the ground truth. That code assumes N = 0 (reconstructing the current observation), so it should be changed to visualize the predicted future state.

This line https://github.com/LostXine/crossway_diffusion/blob/d53a4a30e88ab03f75ec4cc1a01b372bd744cd51/diffusion_policy/workspace/train_crossway_diffusion_unet_hybrid_workspace.py#L254

If you would be willing to change this part of the program, I would appreciate it.

Best,