microsoft / LLF-Bench

A benchmark for evaluating learning agents based on just language feedback
https://microsoft.github.io/LLF-Bench/
MIT License

Why the metaworld env is truncated to 30 steps #11

Closed Jarvis-K closed 6 months ago

Jarvis-K commented 6 months ago

Why is the Meta-World env truncated to 30 steps? Can agents complete all games within 30 steps?

chinganc commented 6 months ago

Hi @Jarvis-K, here we use a modified version of the Meta-World (MW) envs that includes a low-level P-controller, so the action for the env becomes a target pose of the end-effector (given either as an absolute position or as a relative movement). When env.step is called, the P-controller drives the original Meta-World env toward the target for a fixed duration or until a certain precision is met. As a result, the problem horizon of these envs is much shorter, and yes, they can be completed in fewer than 30 steps (in fact, most problems can be solved in fewer than 10 steps). In our view, such a set-point control version of the robot arm is also closer to how a robot arm is often used in practice.
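For readers who want to see why the horizon shrinks, here is a minimal sketch of the set-point idea described above. It is not the LLF-Bench or Meta-World code: `ToyArmEnv`, `SetPointWrapper`, and the parameters `gain`, `max_inner_steps`, `tol`, and `relative` are hypothetical names chosen for illustration, and the toy env only moves a 3D point by a clipped displacement. The point is that one high-level `step` call runs many low-level steps internally, stopping after a fixed budget or once the target is reached to a given precision.

```python
class ToyArmEnv:
    """Hypothetical stand-in for a low-level env.

    Each low-level action is clipped to [-1, 1] per axis and moves the
    end-effector by at most `max_delta` per axis.
    """

    def __init__(self, start=(0.0, 0.0, 0.0), max_delta=0.05):
        self.pos = list(start)
        self.max_delta = max_delta
        self.low_level_steps = 0

    def step(self, action):
        for i, a in enumerate(action):
            a = max(-1.0, min(1.0, a))  # clip the command
            self.pos[i] += a * self.max_delta
        self.low_level_steps += 1
        return list(self.pos)


class SetPointWrapper:
    """Hypothetical wrapper: the high-level action is a target position."""

    def __init__(self, env, gain=10.0, max_inner_steps=100, tol=1e-3,
                 relative=False):
        self.env = env
        self.gain = gain                        # proportional gain
        self.max_inner_steps = max_inner_steps  # fixed-duration budget
        self.tol = tol                          # precision threshold
        self.relative = relative                # target is an offset if True
        self.high_level_steps = 0

    def step(self, target):
        if self.relative:
            goal = [p + t for p, t in zip(self.env.pos, target)]
        else:
            goal = list(target)
        for _ in range(self.max_inner_steps):
            error = [g - p for g, p in zip(goal, self.env.pos)]
            if max(abs(e) for e in error) < self.tol:
                break  # precision met, stop early
            # P-control: command proportional to the position error
            self.env.step([self.gain * e for e in error])
        self.high_level_steps += 1
        return list(self.env.pos)


if __name__ == "__main__":
    env = ToyArmEnv()
    wrapped = SetPointWrapper(env)
    final_pos = wrapped.step((0.3, -0.2, 0.1))
    print("final position:", final_pos)
    print("high-level steps:", wrapped.high_level_steps)
    print("low-level steps:", env.low_level_steps)
```

In this toy setup a single high-level step covers many low-level steps, which is the sense in which a 30-step limit on the wrapped env can correspond to a much longer horizon in the underlying env.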

Jarvis-K commented 6 months ago

Thanks for your clarification!