Aaditya-Prasad / consistency-policy

[RSS 2024] Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation
https://consistency-policy.github.io/
MIT License

How many actions in the output action sequence should be executed in each policy control loop? #6

Closed: HshineO closed this issue 1 month ago

HshineO commented 1 month ago

Hello! Thanks for your excellent work! I have a question about the real-world experiments. In each policy control loop, should I run inference once to get an action sequence and execute only the first action in it (since CP's inference frequency is high), or should I run inference and then execute all of the new actions, as in Diffusion Policy? Can you share the code used in the real-world experiments? Thank you very much!

Aaditya-Prasad commented 1 month ago

Hi! We used policy_wrappy.py in our real-world experiments, which executes the entire action horizon before predicting a new action chunk. We did this to maintain a standard of comparison against Diffusion Policy. If Consistency Policy has excess inference speed in your setup, I think it makes sense to try decreasing the number of actions executed after each prediction (making predictions "fresher").
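To illustrate the trade-off, here is a minimal receding-horizon control-loop sketch. The `policy`, `env`, `predict_action`, `get_obs`, and `execute` names are assumptions for illustration, not the repository's actual API; the point is only that `n_execute` controls how many actions from each predicted chunk are executed before re-predicting.

```python
# Hypothetical control loop: execute only the first `n_execute` actions of each
# predicted action chunk before querying the policy again.
def control_loop(policy, env, n_execute=8, max_steps=1000):
    step = 0
    while step < max_steps:
        obs = env.get_obs()
        # Assumed to return an array of shape (prediction_horizon, action_dim).
        action_chunk = policy.predict_action(obs)
        # Smaller n_execute = more frequent re-planning ("fresher" predictions),
        # which fast inference (e.g. Consistency Policy) makes affordable.
        for action in action_chunk[:n_execute]:
            env.execute(action)
            step += 1
            if step >= max_steps:
                break
```

Setting `n_execute` equal to the full action horizon recovers the Diffusion-Policy-style behavior described above, while smaller values re-predict more often.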

There is one concern here: even if you keep the same prediction horizon (i.e., the number of steps into the future that you predict, which upper-bounds the number you deploy before making a new prediction), you may still get different behavior, because you are introducing more sampling steps that aren't conditioned on previous generations. This is something to test, and how much it affects you is likely task- and dataset-dependent.