mees / calvin

CALVIN - A benchmark for Language-Conditioned Policy Learning for Long-Horizon Robot Manipulation Tasks
http://calvin.cs.uni-freiburg.de
MIT License

Benchmark Question #50

Closed Anchoret13 closed 1 year ago

Anchoret13 commented 1 year ago

Hello there!

I'm intrigued by the results showcased on your website, specifically the one related to Task Train D -> Test D. Upon reviewing the methods used, I noticed that there is a technique referred to as "baseline + delta actions".

Would you mind elaborating on what the term "delta actions" refers to in this context? Is the delta action learned with a method such as residual policy learning, or with something else? I would appreciate more information to further my understanding of the process.

Thank you!

mees commented 1 year ago

Delta actions just refers to relative actions: instead of learning the position of the TCP in the world, you learn to predict the relative displacement to the next step.
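
To make the distinction concrete, here is a minimal sketch of the idea, not code from the CALVIN repository. It assumes a trajectory of absolute TCP positions given as plain `[x, y, z]` lists; the function names `to_delta_actions` and `integrate_deltas` are hypothetical, and orientation, gripper state, and any scaling or normalization of the actions are left out.

```python
from typing import List, Sequence


def to_delta_actions(abs_positions: Sequence[Sequence[float]]) -> List[List[float]]:
    """Turn absolute TCP positions into per-step relative displacements.

    Hypothetical helper: delta[t] = position[t + 1] - position[t].
    """
    return [
        [nxt - cur for cur, nxt in zip(p_t, p_next)]
        for p_t, p_next in zip(abs_positions, abs_positions[1:])
    ]


def integrate_deltas(start: Sequence[float],
                     deltas: Sequence[Sequence[float]]) -> List[List[float]]:
    """Recover absolute positions by accumulating deltas from a start position."""
    positions = [list(start)]
    for delta in deltas:
        positions.append([p + d for p, d in zip(positions[-1], delta)])
    return positions


# Illustrative trajectory of absolute TCP positions (made-up values).
trajectory = [
    [0.0, 0.0, 0.5],
    [0.25, 0.0, 0.5],
    [0.5, 0.25, 0.5],
]

# Relative targets: one displacement per transition between steps.
deltas = to_delta_actions(trajectory)

# Accumulating the deltas from the first position reproduces the trajectory.
recovered = integrate_deltas(trajectory[0], deltas)
```

With absolute actions the policy's targets would be the entries of `trajectory` itself; with delta actions the targets are the entries of `deltas`, which has one element fewer because each one describes a transition between two consecutive steps.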