real-stanford / diffusion_policy

[RSS 2023] Diffusion Policy: Visuomotor Policy Learning via Action Diffusion
https://diffusion-policy.cs.columbia.edu/
MIT License

Regarding pretrained models #14

Open LostXine opened 1 year ago

LostXine commented 1 year ago

Hello,

Thanks again for this great project. It would be great if you could help diagnose this issue regarding the pretrained models.

When I try to evaluate your pretrained models for the hybrid CNN setting, I find that they do not work properly on Push-T, Transport ph, and Transport mh. (There could be more, but I haven't tried them yet.) The action trajectories look relatively reasonable (not random noisy actions), but the agent simply cannot finish the task (Push-T mean score: 0.09, Transport mean score: 0). However, when I train a model from scratch and evaluate it, it works fine, which suggests that the evaluation code is correct. I tested on two machines at different locations, and all models were downloaded directly from your website. I also performed an integrity check and confirmed that the two copies of the models on the two machines are identical. The training code can load the model files without errors, and the number of epochs matches the filenames, but the models just do not generate the correct actions. After days of debugging, I cannot find any promising direction to look into.

So could you please share some insights on what may cause this issue?

Thank you so much!

Best regards,

cheng-chi commented 1 year ago

Hi @LostXine, interesting. Do you mind sharing your code for evaluating the pretrained models?

LostXine commented 1 year ago

Hi @cheng-chi ,

Thanks a lot for your response. I tried two versions:

  1. Untouched training code from this repo (diffusion_policy/workspace/train_diffusion_unet_hybrid_workspace.py).
  2. A simplified version whose core run function looks like this:
import hydra
from diffusion_policy.env_runner.base_image_runner import BaseImageRunner

# ========= eval for this epoch ==========
# use the EMA weights when enabled, otherwise the raw model
policy = self.model
if cfg.training.use_ema:
    policy = self.ema_model
policy.to(device)
policy.eval()

# configure env
env_runner: BaseImageRunner
env_runner = hydra.utils.instantiate(
    cfg.task.env_runner, output_dir=self.output_dir)
assert isinstance(env_runner, BaseImageRunner)

# run rollout
runner_log = env_runner.run(policy)
del env_runner

# log all
step_log.update(runner_log)
json_logger.log(step_log)
print(step_log)

Both show the same behavior: only the model I trained myself works. I could also send the hash sums of the models I downloaded if you think that would help.

Thank you so much!

cheng-chi commented 1 year ago

Hi @LostXine, I'm not sure what exactly is wrong in your script, but I have just created a script (fairly similar to yours) that can evaluate all of the provided checkpoints. Please check out the updated README for usage. On Push-T lowdim + Diffusion Policy CNN I'm getting "test/mean_score": 0.9150393806777066 using epoch=0550-test_mean_score=0.969.ckpt. On Push-T image + Diffusion Policy CNN I'm getting "test/mean_score": 0.9177610059407988 using epoch=0500-test_mean_score=0.884.ckpt.
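For reference, a typical invocation of the repo's eval.py entry point would look roughly like the following (the checkpoint and output paths are illustrative; the exact flags are whatever the updated README documents):

python eval.py --checkpoint data/epoch=0550-test_mean_score=0.969.ckpt --output_dir data/pusht_eval_output --device cuda:0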

LostXine commented 1 year ago

Hi @cheng-chi, thank you so much for your effort. I'll check it out and get back to you soon. Best regards,

LostXine commented 1 year ago

Hi @cheng-chi, thanks for all your efforts, we finally figured it out. It turns out that the order of the keys in policy.shape_meta.obs determines the order of the state features (i.e., the channel order) in the global_cond tensor. The config files currently listed on the website do not match the configs stored in the checkpoints in terms of key order, even though the values are identical. As a result, the channel order of global_cond differs, which causes the unexpected behavior. Hope this helps, and thanks again.
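To illustrate the failure mode, here is a minimal sketch. The key names and feature sizes are hypothetical, and it assumes the policy builds global_cond by concatenating per-key observation features in shape_meta.obs order:

import torch

# Hypothetical observation features, for illustration only.
obs_features = {
    'image': torch.randn(1, 64),     # visual embedding
    'agent_pos': torch.randn(1, 2),  # low-dim state
}

order_in_checkpoint = ['image', 'agent_pos']  # order the network was trained with
order_in_yaml = ['agent_pos', 'image']        # same keys, different YAML order

# Concatenating in shape_meta.obs order means the same values land in
# permuted channels when the key order differs.
cond_ckpt = torch.cat([obs_features[k] for k in order_in_checkpoint], dim=-1)
cond_yaml = torch.cat([obs_features[k] for k in order_in_yaml], dim=-1)
assert not torch.equal(cond_ckpt, cond_yaml)  # the network sees scrambled inputs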

cheng-chi commented 1 year ago

@LostXine Oh great, good to know! I will probably add sorting for the keys in the future. Depending on YAML key ordering is indeed a bit problematic.
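A sketch of that sorting fix, continuing the hypothetical example above (not the repo's actual implementation):

# Sorting the keys makes the channel layout independent of YAML order.
sorted_keys = sorted(obs_features.keys())
global_cond = torch.cat([obs_features[k] for k in sorted_keys], dim=-1)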