buoyancy99 / diffusion-forcing

code for "Diffusion Forcing: Next-token Prediction Meets Full-Sequence Diffusion"

guidance_scale in df_planning.yaml #2

Closed by b18arundhati 1 month ago

b18arundhati commented 2 months ago

It crashes when guidance_scale is set to a non-zero value (I have tried 1 and 10) in the planning experiment. Do any other parameters also need to be set for reward-guided planning to work?

buoyancy99 commented 2 months ago

Hi, can you provide me with more details? Which environment is this and did you use the suggested commands? Did you train it to ~100k steps?

buoyancy99 commented 1 month ago

Sorry about the issue, this is confirmed to be a bug introduced when cleaning up the code. I will push the fix next week along with a much improved version of df_planning based on my new transformer implementation, with much better results and much faster speed!

buoyancy99 commented 1 month ago

It turns out I need more time for the transformer version release. However, I can also reproduce the result with a simple change in normalization:

First, git pull the repo to the latest version.

Train 100k steps with python -m main +name=original_medium_sample20 experiment=exp_planning dataset=maze2d_medium algorithm=df_planning dataset.action_std=[2,2]

Test by adding load={wandb_run_id} algorithm.guidance_scale=10 experiment.tasks=[validation] to the training command (the guidance scale can be tuned between 10 and 20).

I also trained a fresh ckpt on my side with this exact command. Download it from Google Drive to your project root.

Then run tar -xzvf medium_a2std_ckpt.tar.gz to extract it, and test with python -m main +name=original_medium_sample20 experiment=exp_planning dataset=maze2d_medium algorithm=df_planning dataset.action_std=[2,2] load=outputs/medium_a2std.ckpt algorithm.guidance_scale=20 experiment.tasks=[validation]
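
For anyone who wants to try several guidance scales in the suggested 10 to 20 range, here is a minimal sketch that assembles the commands above as argument lists and prints them shell-quoted. The helper names build_command and guidance_sweep are hypothetical and not part of the repo; every override string is copied from this thread, and the three scales chosen are just an example.

```python
import shlex

# Overrides copied verbatim from the commands in this thread.
BASE_OVERRIDES = [
    "+name=original_medium_sample20",
    "experiment=exp_planning",
    "dataset=maze2d_medium",
    "algorithm=df_planning",
    "dataset.action_std=[2,2]",
]


def build_command(load=None, guidance_scale=None, validate=False):
    """Return the argv list for one run (hypothetical helper, not in the repo)."""
    argv = ["python", "-m", "main", *BASE_OVERRIDES]
    if load is not None:
        # Either a wandb run id or a checkpoint path, as in the thread.
        argv.append(f"load={load}")
    if guidance_scale is not None:
        argv.append(f"algorithm.guidance_scale={guidance_scale}")
    if validate:
        argv.append("experiment.tasks=[validation]")
    return argv


def guidance_sweep(load, scales=(10, 15, 20)):
    """Yield one validation command per guidance scale (example values only)."""
    for scale in scales:
        yield build_command(load=load, guidance_scale=scale, validate=True)


if __name__ == "__main__":
    # Training command (no load, no guidance override).
    print(shlex.join(build_command()))
    # Validation commands against the shared checkpoint path from the thread.
    for argv in guidance_sweep("outputs/medium_a2std.ckpt"):
        print(shlex.join(argv))
```

shlex.join quotes the bracketed overrides such as dataset.action_std=[2,2], which helps in shells that would otherwise treat the brackets as a glob pattern. The same lists can be passed directly to subprocess.run if you prefer to launch the runs from Python.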

buoyancy99 commented 1 month ago

For those who come back to this post, I just released the transformer version on the main branch. Check it out! Here are some visualizations of it:

[Image: Screenshot 2024-07-30 at 6:38:36 PM]