[Bug Report] AdroitHandRelocate set env state is unstable when using states where the ball is being grasped.

StoneT2000 commented 1 year ago

If you are submitting a bug report, please fill in the following details and use the tag [bug].

Describe the bug It seems mujoco is a little unstable when it comes to setting state when active manipulation + contact is going on between the hand and a non-articulated actor like the sphere. (not an issue in AdroitDoor, and surprisingly not an issue it appears in AdroitPen).

In AdroitHandRelocate, setting env state to a state where the ball is grasped by the hand causes some instability and the ball will teleport out / penetrate the hand.

Code example Take a succesfull trajectory from minari, replay it. Call env.get_env_state() to get the final state, then reset using the final state. The ball will usually teleport out. There is also another "bug" where the output of get_env_state includes 2 extraneous parts hand_qpos and palm_pos which are not part of the state space.

System Info Ubuntu, Gymnasium 0.29, Python 3.9

Checklist

[x] I have checked that there is no similar issue in the repo (required)

Solution?

I got around this by setting state twice. The first time I set I note the difference in ball position between where it actually got set to (after the one step of mujoco sim which self.set_state(qp, qv) does), and then add that difference to the state I wanted to set to

def set_env_state(self, state_dict):
        """
        Set the state which includes hand as well as objects and targets in the scene
        """
        assert self._state_space.contains(
            state_dict
        ), f"The state dictionary {state_dict} must be a member of {self._state_space}."
        qp = state_dict["qpos"]
        qv = state_dict["qvel"]

        self.model.body_pos[self.obj_body_id] = state_dict["obj_pos"]
        self.model.site_pos[self.target_obj_site_id] = state_dict["target_pos"]
        self.set_state(qp, qv)
        diff = self.model.body_pos[self.obj_body_id] - self.data.xpos[self.obj_body_id]
        # print(diff, "desired", self.model.body_pos[self.obj_body_id], "achieved", self.data.xpos[self.obj_body_id])  
        self.model.body_pos[self.obj_body_id] = state_dict["obj_pos"] + diff
        self.set_state(qp, qv)
        # print(diff, "desired", self.model.body_pos[self.obj_body_id], "achieved", self.data.xpos[self.obj_body_id])

rodrigodelazcano commented 1 year ago

Hey @StoneT2000 sorry for the late reply and thanks a lot for finding this. I was able to replicate the bug and reached your same conclusion in the origin of the issue (at the beginning I thought it had something to do with different frames of reference but everything seems to be alright regarding this).

Your solution looks good to me, I can't think of a simpler way of fixing this. Do you have the time to make a PR?

StoneT2000 commented 1 year ago

I can make a PR implementing this fix however i'm not particularly very happy about it lmao. Maybe bring this up to the Mujoco repo and see if they know why this might happen? Seems odd this would work. Not sure how much Mujoco people are involved/knowledgeable about these environments however

rodrigodelazcano commented 1 year ago

Yeap, I agree it's kind of hacky and I'm impressed how you were able to find this solution. This is definitely something we should report to the Mujoco team, I just wanted to run some more test in case we are missing something (we also don't collaborate with them developing these environments).

In the meantime I will approve a PR with this fix since this solution won't affect much the environment's performance and I don't know how much time it will take us to reach a more accurate solution.

Kallinteris-Andreas commented 11 months ago

I believe the source of the issue is that the self.model is modified, for what I can tell is no reason

@rodrigodelazcano do you happen to have any insight as to why self.model is being modified?

Farama-Foundation / Gymnasium-Robotics

[Bug Report] AdroitHandRelocate set env state is unstable when using states where the ball is being grasped. #165

Checklist

Solution?