Benchmark Question - Githubissues

Hello there!

I'm intrigued by the results showcased on your website, specifically the one related to Task Train D -> Test D. Upon reviewing the methods used, I noticed that there is a technique referred to as "baseline + delta actions".

Would you mind elaborating on what the term "delta actions" refers to in this particular context? Is the delta action learned by some learning methods like residual policy learning or something else? I would appreciate more information to further my understanding of the process.

Thank you!

mees / calvin

Benchmark Question #50