Farama-Foundation / Metaworld

Collections of robotics environments geared towards benchmarking multi-task and meta reinforcement learning
https://metaworld.farama.org/
MIT License
1.28k stars, 275 forks

Question about the reward of the button press task. #433

Closed: RyanLiu112 closed 1 year ago

RyanLiu112 commented 1 year ago

Hi,

I used both the scripted policy in Meta-World and an expert policy trained with SAC to roll out trajectories in the Button Press task, but the near_object flag in the returned info never turns True. When I try the Drawer Open task, near_object turns True when the arm approaches the drawer. I found that near_object in the Drawer Open task is computed from the position of the arm's hand, while near_object in the Button Press task is computed from the center of leftEndEffector and rightEndEffector. What is the difference between these two positions? Would it be more appropriate to use the hand position to compute near_object in the Button Press task?
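To make the question concrete, here is a minimal sketch of how the two reference points can give different answers. It assumes the flag is a plain distance threshold; the coordinates, the threshold value, and the helper names (`midpoint`, `is_near`) are made up for illustration and are not taken from the Meta-World source.

```python
import math

# Hypothetical positions in metres, chosen for illustration only.
# They are NOT read from the simulator.
hand_pos = (0.00, 0.60, 0.20)      # assumed position of the arm's hand
left_pad = (-0.02, 0.62, 0.15)     # assumed leftEndEffector position
right_pad = (0.02, 0.62, 0.15)     # assumed rightEndEffector position
button_pos = (0.00, 0.64, 0.15)    # assumed target (button) position

NEAR_THRESHOLD = 0.05  # assumed value; the real threshold may differ


def midpoint(a, b):
    """Return the point halfway between two 3D points."""
    return tuple((x + y) / 2 for x, y in zip(a, b))


def is_near(point, target, threshold=NEAR_THRESHOLD):
    """True when the Euclidean distance to the target is within the threshold."""
    return math.dist(point, target) <= threshold


# Reference point 1: the hand itself.
near_from_hand = is_near(hand_pos, button_pos)

# Reference point 2: the center of the two end-effector pads.
finger_center = midpoint(left_pad, right_pad)
near_from_fingers = is_near(finger_center, button_pos)

print("near (hand):", near_from_hand)
print("near (finger center):", near_from_fingers)
```

With these made-up numbers the two checks disagree, which only shows that the choice of reference point can change the flag for the same arm pose. It says nothing about which one the task actually needs.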

reginald-mclean commented 1 year ago

Sorry for the delay. My thinking is that for the drawer open task, the arm needs to grip the handle to open the drawer, so near_object is set up so that, in the reward computation, the agent is guided to grip the handle via the distance between the drawer handle and the hand. The button press top-down task does not require gripping, so the arm can keep the gripper closed and press the button with the tip(s) of the end effector "fingers." I hope this helps.

RyanLiu112 commented 1 year ago

Thanks for your response, it is clear now. Still, the near_object flag of the Button Press task should be fixed, since it never turns True under the scripted policy.
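For anyone who wants to check this on their own rollouts, here is a small sketch that scans the per-step info dicts and reports the first step at which a flag turns on. The helper name `first_true_step` and the two sample rollouts are hypothetical. It also assumes the flags are stored as floats (0.0 or 1.0); adjust if your version stores booleans.

```python
def first_true_step(infos, key="near_object"):
    """Return the index of the first step where info[key] is truthy, else None."""
    for step, info in enumerate(infos):
        if bool(info.get(key, False)):
            return step
    return None


# Hypothetical info dicts standing in for what a rollout would return.
# In practice, collect one dict per environment step instead.
button_press_infos = [
    {"near_object": 0.0, "success": 0.0},
    {"near_object": 0.0, "success": 0.0},
    {"near_object": 0.0, "success": 1.0},  # task succeeds, flag still off
]
drawer_open_infos = [
    {"near_object": 0.0, "success": 0.0},
    {"near_object": 1.0, "success": 0.0},  # flag turns on during approach
    {"near_object": 1.0, "success": 1.0},
]

for name, infos in [("button press", button_press_infos),
                    ("drawer open", drawer_open_infos)]:
    step = first_true_step(infos)
    if step is None:
        print(f"{name}: near_object never turned True")
    else:
        print(f"{name}: near_object first True at step {step}")
```

Running the same helper with `key="success"` on the same rollout makes it easy to spot episodes that succeed without near_object ever turning on.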