Closed: Jarvis-K closed this issue 6 months ago
Hi @Jarvis-K, here we use a modified version of the Meta-World (MW) envs that includes a low-level P-controller, so the action for the env becomes the target pose of the end-effector, given either as an absolute pose or as a relative movement. When env.step is called, the P-controller drives the original Meta-World env toward the target for a fixed duration or until a certain precision is met. As a result, the problem horizon of these envs is much shorter, and yes, they can be completed in fewer than 30 steps (in fact, most problems can be solved in fewer than 10 steps). In our view, such a set-point control version of the robot arm is also closer to how a robot arm is often used in practice.
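As a rough illustration of the set-point idea described above, here is a toy sketch. It is not the actual wrapper from this repo: `ToyArmEnv`, `SetPointWrapper`, and the parameters `kp`, `max_inner_steps`, `tol`, and `relative` are all hypothetical names chosen for the example, and the toy env only stands in for a low-level env whose action is a clipped end-effector displacement.

```python
import numpy as np


class ToyArmEnv:
    """Hypothetical stand-in for a low-level env (NOT Meta-World itself).

    The action is a displacement command clipped to [-1, 1] per axis and
    scaled by max_delta, loosely mimicking a delta-position controller.
    """

    def __init__(self, max_delta=0.05):
        self.max_delta = max_delta
        self.pos = np.zeros(3)
        self.low_level_steps = 0

    def reset(self):
        self.pos = np.zeros(3)
        self.low_level_steps = 0
        return self.pos.copy()

    def step(self, action):
        delta = np.clip(action, -1.0, 1.0) * self.max_delta
        self.pos = self.pos + delta
        self.low_level_steps += 1
        return self.pos.copy()


class SetPointWrapper:
    """Assumed design: one outer step = P-control toward a target position."""

    def __init__(self, env, kp=10.0, max_inner_steps=50, tol=1e-3, relative=False):
        self.env = env
        self.kp = kp                            # proportional gain
        self.max_inner_steps = max_inner_steps  # "fixed duration" cap
        self.tol = tol                          # "certain precision" threshold
        self.relative = relative                # target given as an offset?

    def reset(self):
        return self.env.reset()

    def step(self, target):
        target = np.asarray(target, dtype=float)
        pos = self.env.pos.copy()
        if self.relative:
            target = pos + target
        for _ in range(self.max_inner_steps):
            error = target - pos
            if np.linalg.norm(error) < self.tol:
                break  # close enough, stop early
            pos = self.env.step(self.kp * error)  # P-control: action proportional to error
        return pos


env = SetPointWrapper(ToyArmEnv())
env.reset()
final_pos = env.step([0.2, 0.0, 0.1])  # a single high-level step
print(final_pos, env.env.low_level_steps)
```

In this sketch a single call to the wrapper's `step` consumes several low-level steps, which is why the horizon seen by the agent is much shorter than the horizon of the underlying env.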
Thanks for your clarification!
Why is the Meta-World env truncated to 30 steps? Can agents complete all games within 30 steps?