Closed: mbreuss closed this issue 3 months ago
Hi @mbreuss,
Sorry, I haven't run MDT or measured the inference speed of GR-1 yet, but I did think about it:
By the way, I have some questions about accelerating diffusion models:
About using CLA loss in my network:
But there are some confusing phenomena:
StarCycle
Thanks for the detailed answer!
The GR-1 part is not that important; I was just curious :D
I believe a middle way between replanning at every single step and a full trajectory rollout would be ideal. My best bet for the future would be some kind of gating that lets the model actively replan when certain conditions are met, so that it can react quickly in an emergency.
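To make the gating idea concrete, here is a minimal sketch of a rollout loop that executes an action chunk open-loop but replans early whenever a gate fires. Everything in it is hypothetical and only for illustration: `policy`, `env_step`, and `should_replan` are placeholder callables, not functions from MDT or GR-1, and the gate condition itself is left to the user.

```python
def run_with_gated_replanning(policy, env_step, should_replan, obs, num_steps):
    """Execute action chunks, replanning early when the gate fires.

    policy(obs)              -> iterable of actions (an action chunk); hypothetical
    env_step(obs, action)    -> next observation; hypothetical
    should_replan(obs, plan) -> bool, the gate; hypothetical
    """
    plan = []
    replans = 0
    for _ in range(num_steps):
        # Replan when the current chunk is used up or the gate asks for it.
        if not plan or should_replan(obs, plan):
            plan = list(policy(obs))
            replans += 1
        action = plan.pop(0)
        obs = env_step(obs, action)
    return obs, replans


# Toy 1-D example: the "policy" plans 4 equal steps that move the state to 0.
def toy_policy(obs):
    return [-obs / 4.0] * 4


def toy_env_step(obs, action):
    return obs + action


final_obs, n_replans = run_with_gated_replanning(
    toy_policy, toy_env_step, lambda obs, plan: False, 4.0, 8
)
print(final_obs, n_replans)
```

With a gate that never fires, this reduces to plain chunked execution (one plan per chunk); with a gate that always fires, it reduces to replanning at every step. The interesting cases sit in between.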
Accelerating Diffusion Models
Regarding your questions for MDT and VLM Diffusion
Choice of Diffusion Head
Why are you using the default Diff Policy Transformer and not the MDT Decoder (link)? I highly recommend using the FiLM-conditioned decoder from MDT, which conditions on the noise level effectively and separates the noise token from the other state and goal tokens! Otherwise the model does not perform well, especially in settings with many obs and goal tokens. I compared the MDT architecture with the default Diff-T on CALVIN, and the default Diff-T only achieves an average rollout length of about 1.5.
Which part of the VLM are you training for Florence? Can you encode multiple camera views with it already?
Did you try out different ways to use the tokens for the action head? Did you also test using all output tokens?
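For readers unfamiliar with the FiLM conditioning mentioned above: in general, FiLM predicts a per-channel scale and shift from a conditioning signal (here, an embedding of the noise level) and applies them to the token features, so the noise level does not have to be passed as one more token in the sequence. The sketch below is a generic, plain-Python illustration and not the MDT decoder code. All function names, the sinusoidal embedding, and the `(1 + scale)` parameterization are assumptions made for this example.

```python
import math


def sinusoidal_embedding(sigma, dim):
    """Embed a scalar noise level into `dim` features (dim assumed even)."""
    half = dim // 2
    freqs = [math.exp(-math.log(10000.0) * i / half) for i in range(half)]
    return [math.sin(sigma * f) for f in freqs] + [math.cos(sigma * f) for f in freqs]


def linear(x, weight, bias):
    """Dense layer: weight is a list of rows, one row per output feature."""
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weight, bias)]


def film(tokens, cond, w_scale, b_scale, w_shift, b_shift):
    """Apply feature-wise scale and shift, predicted from `cond`, to every token."""
    scale = linear(cond, w_scale, b_scale)
    shift = linear(cond, w_shift, b_shift)
    return [
        [(1.0 + s) * t + sh for t, s, sh in zip(token, scale, shift)]
        for token in tokens
    ]


# Two tokens with 2 features each, conditioned on a 4-dim noise embedding.
tokens = [[1.0, 2.0], [3.0, 4.0]]
cond = sinusoidal_embedding(0.5, 4)
zeros = [[0.0] * 4 for _ in range(2)]
# With zero weights and biases the modulation is the identity.
print(film(tokens, cond, zeros, [0.0, 0.0], zeros, [0.0, 0.0]))
```

In a real decoder the scale and shift would come from learned layers and be applied inside each transformer block; the point here is only that the noise level modulates the features instead of competing with state and goal tokens for attention.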
Weird Rollout Behavior
Here is a low-level example of that: the model is trained only on demos in a small area of [-6, 6]. The DP generalizes the overall trend. I averaged 100 action predictions, and the low variance in the x-areas below -6 and above 6 shows that the model is pretty certain about what it is doing, although it has never seen these states.
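The averaging described above can be sketched roughly as follows: sample the policy many times at the same state, then look at the mean and variance of the predicted actions. This is only an illustration. `sample_action` is a hypothetical stand-in for one stochastic forward pass of a diffusion policy, and the toy sampler below is made up rather than taken from the experiment.

```python
import random
import statistics


def prediction_stats(sample_action, x, n_samples=100):
    """Return (mean, variance) of repeated action samples at state x."""
    samples = [sample_action(x) for _ in range(n_samples)]
    return statistics.fmean(samples), statistics.pvariance(samples)


# Hypothetical stochastic "policy": a linear trend plus small Gaussian noise.
rng = random.Random(0)


def toy_sampler(x):
    return 0.5 * x + rng.gauss(0.0, 0.01)


# Query states inside and outside the training range [-6, 6].
for x in (-8.0, 0.0, 8.0):
    mean, var = prediction_stats(toy_sampler, x)
    print(f"x={x:+.1f}  mean={mean:.3f}  var={var:.5f}")
```

A low variance outside the training range only says that the samples agree with each other; it does not say that the predicted actions are correct there.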
Video Diffusion
GENIE
That looks super interesting! Keep me updated on this, I believe that GENIE offers many interesting applications for robotics!
If you want, you can drop me an email at moritz.reuss@kit.edu if you have more questions regarding Diffusion Policies and similar ideas! I would be interested in discussing this further.
Best, Moritz
Great, thanks! I sent an email to you!
Hi @StarCycle,
thanks for your contributions toward making GR-1 fully open source! I was curious about the inference speed of GR-1 compared to our MDT policy (if you tried it too). Can you share some of your experiences? :)
Thanks!