I'm intrigued by the results showcased on your website, specifically the one related to Task Train D -> Test D. Upon reviewing the methods used, I noticed that there is a technique referred to as "baseline + delta actions".
Would you mind elaborating on what the term "delta actions" refers to in this particular context? Is the delta action learned by some learning methods like residual policy learning or something else? I would appreciate more information to further my understanding of the process.
"Delta actions" just refers to relative actions: instead of learning the absolute position of the TCP (tool center point) in the world frame, you learn to predict the relative displacement to the next step.
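To make the distinction concrete, here is a minimal sketch of how absolute TCP positions could be turned into relative (delta) action targets and back. It is only an illustration, not the project's actual code: the function names `to_delta_actions` and `integrate_deltas` are hypothetical, and it assumes the action is a plain (x, y, z) position with no orientation, gripper state, or scaling.

```python
# Hypothetical helpers illustrating absolute vs. relative (delta) actions.
# Assumption: each action is just an (x, y, z) TCP position in the world frame.

def to_delta_actions(positions):
    """Turn a sequence of absolute TCP positions into per-step displacements."""
    return [
        tuple(nxt_c - cur_c for cur_c, nxt_c in zip(cur, nxt))
        for cur, nxt in zip(positions, positions[1:])
    ]


def integrate_deltas(start, deltas):
    """Recover absolute positions by accumulating displacements from a start pose."""
    pos = tuple(start)
    trajectory = [pos]
    for delta in deltas:
        pos = tuple(p + d for p, d in zip(pos, delta))
        trajectory.append(pos)
    return trajectory


if __name__ == "__main__":
    absolute = [(0.0, 0.0, 0.5), (0.25, 0.0, 0.5), (0.25, 0.5, 0.75)]
    deltas = to_delta_actions(absolute)   # what a delta-action policy would predict
    print(deltas)
    print(integrate_deltas(absolute[0], deltas))
```

With absolute actions the policy's targets would be the entries of `absolute`; with delta actions the targets are the entries of `deltas`, and the controller adds each predicted displacement to the current TCP position.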