Aaditya-Prasad / consistency-policy

[RSS 2024] Consistency Policy: Accelerated Visuomotor Policies via Consistency Distillation
https://consistency-policy.github.io/
MIT License

How many actions in the output action sequence should be executed in each policy control loop? #6

Closed: HshineO closed this issue 1 month ago

HshineO commented 1 month ago

Hello! Thanks for your excellent work! I have a question about the real-world experiments. In each policy control loop, should I run inference once to get an action sequence and execute only the first action in it (since CP's inference frequency is high), or should I run inference and then execute all of the new actions, as in Diffusion Policy? Can you share the code used in the real-world experiments? Thank you very much!

Aaditya-Prasad commented 1 month ago

Hi! We used policy_wrappy.py in our real-world experiments, which executes the entire action horizon before predicting a new action chunk. We did this to maintain a standard of comparison against Diffusion Policy. If Consistency Policy has excess inference speed in your setup, I think it makes sense to try decreasing the number of actions executed after each prediction (making predictions "fresher").
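To illustrate the trade-off, here is a minimal receding-horizon control-loop sketch. The `policy`, `env`, `predict_action`, `get_obs`, and `execute` names are assumptions for illustration, not the repository's actual API; the point is only that `n_execute` controls how many actions from each predicted chunk are executed before re-predicting.

```python
# Hypothetical control loop: execute only the first `n_execute` actions of each
# predicted action chunk before querying the policy again.
def control_loop(policy, env, n_execute=8, max_steps=1000):
    step = 0
    while step < max_steps:
        obs = env.get_obs()
        # Assumed to return an array of shape (prediction_horizon, action_dim).
        action_chunk = policy.predict_action(obs)
        # Smaller n_execute = more frequent re-planning ("fresher" predictions),
        # which fast inference (e.g. Consistency Policy) makes affordable.
        for action in action_chunk[:n_execute]:
            env.execute(action)
            step += 1
            if step >= max_steps:
                break
```

Setting `n_execute` equal to the full action horizon recovers the Diffusion-Policy-style behavior described above, while smaller values re-predict more often.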

There is one concern here: even if you keep the same prediction horizon (i.e., the number of steps into the future that you predict, which upper-bounds the number you deploy before making a new prediction), you may still get different behavior, because you are introducing more sampling steps that aren't conditioned on previous generations. This is something to test, and how much it affects you is likely task- and dataset-dependent.