Farama-Foundation / Gymnasium-Robotics

A collection of robotics simulation environments for reinforcement learning
https://robotics.farama.org/
MIT License
529 stars 85 forks

[Bug Report] False information drawn from MuJoCo simulator while using the Gymnasium-Robotics API's MuJoCo utility functions for custom FetchPObstaclePickAndPlace environment. #186

Closed ChristosPeridis closed 9 months ago

ChristosPeridis commented 12 months ago

Hello dear members of the Farama Team!

I already raised this in #183 and am opening this new issue as a kind reminder. I have forked the Gymnasium-Robotics API and am working on a custom environment, based on the 'FetchPickAndPlace-v2' environment, that introduces an obstacle into the simulation. I have created the .xml file that defines the obstacle, which is named 'obstacle0'. For this environment I want to add the following to the original observation of the 'FetchPickAndPlace' environment:

  1. The position of the obstacle in the 3D space.
  2. The relative position of the robot gripper and the obstacle.
  3. The relative position of the object and the obstacle.

However, after creating the environment and performing some steps in it, the environment returns zeros for the three new entries in the observation. An example can be seen in the picture below:

image

To debug this behaviour further, I imported mujoco and changed the environment's mujoco_utils attribute from hidden to public, so that I can call the low-level methods that extract data from the MuJoCo simulator. I was surprised to see that the obstacle and the object were returning the same position:

image

I retrieved this data using the 'get_site_xpos()' function of the 'mujoco_utils' module. However, the two positions cannot really be identical: the observation shows that the achieved goal has a different value, and the images of the simulation show the object in a different position from the obstacle.

image

image

Here is the overridden version of the 'generate_mujoco_observations()' method that I am using:

def generate_mujoco_observations(self):
    # positions
    grip_pos = self._utils.get_site_xpos(self.model, self.data, "robot0:grip")

    dt = self.n_substeps * self.model.opt.timestep
    grip_velp = (
        self._utils.get_site_xvelp(self.model, self.data, "robot0:grip") * dt
    )

    robot_qpos, robot_qvel = self._utils.robot_get_obs(
        self.model, self.data, self._model_names.joint_names
    )
    if self.has_object:
        object_pos = self._utils.get_site_xpos(self.model, self.data, "object0")
        # rotations
        object_rot = rotations.mat2euler(
            self._utils.get_site_xmat(self.model, self.data, "object0")
        )
        # velocities
        object_velp = (
            self._utils.get_site_xvelp(self.model, self.data, "object0") * dt
        )
        object_velr = (
            self._utils.get_site_xvelr(self.model, self.data, "object0") * dt
        )
        # gripper state
        object_rel_pos = object_pos - grip_pos
        object_velp -= grip_velp
    else:
        object_pos = (
            object_rot
        ) = object_velp = object_velr = object_rel_pos = np.zeros(0)
    gripper_state = robot_qpos[-2:]

    gripper_vel = (
        robot_qvel[-2:] * dt
    )  # change to a scalar if the gripper is made symmetric

    # Extract the positions of the gripper and the object
    #grip_pos = self._utils.get_site_xpos(self.model, self.data, "robot0:grip")
    #object_pos = self._utils.get_site_xpos(self.model, self.data, "object0")

    # Look up the obstacle's position (assuming its site name is "obstacle0")
    obstacle_pos = self._utils.get_site_xpos(self.model, self.data, "obstacle0")

    # Calculate the relative positions
    gripper_to_obstacle = obstacle_pos - grip_pos
    object_to_obstacle = obstacle_pos - object_pos

    # Extend the original observations with the relative positions
    #extended_observations = observations + (gripper_to_obstacle,) + (object_to_obstacle,)        
    return (
        grip_pos,
        object_pos,
        object_rel_pos,
        gripper_state,
        object_rot,
        object_velp,
        object_velr,
        grip_velp,
        gripper_vel,
        obstacle_pos,
        gripper_to_obstacle,
        object_to_obstacle
    )
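For the three new arrays to show up in the observation, whatever consumes this tuple has to include them as well. The sketch below assumes that '_get_obs()' is overridden too and concatenates every returned array into one flat vector. The helper name flatten_observation and all placeholder values are made up for illustration; only the order of the entries follows the return statement above.

```python
import numpy as np


def flatten_observation(parts):
    """Concatenate a sequence of arrays into one flat float vector, in order."""
    return np.concatenate([np.asarray(p, dtype=np.float64).ravel() for p in parts])


# Placeholder values with the shapes expected when has_object is True.
grip_pos = np.array([1.34, 0.75, 0.53])
object_pos = np.array([1.25, 0.60, 0.42])
obstacle_pos = np.array([1.30, 0.75, 0.50])

parts = (
    grip_pos,
    object_pos,
    object_pos - grip_pos,      # object_rel_pos
    np.zeros(2),                # gripper_state
    np.zeros(3),                # object_rot
    np.zeros(3),                # object_velp
    np.zeros(3),                # object_velr
    np.zeros(3),                # grip_velp
    np.zeros(2),                # gripper_vel
    obstacle_pos,
    obstacle_pos - grip_pos,    # gripper_to_obstacle
    obstacle_pos - object_pos,  # object_to_obstacle
)

obs = flatten_observation(parts)
print(obs.shape)  # nine more entries than the original nine-array tuple gives
```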

And here is the code of obstacle_pick_and_place.xml:

Could you please help me debug and fix this behaviour?

Thank you very much in advance for all the valuable help and support!

Kind regards,

Christos Peridis

Kallinteris-Andreas commented 9 months ago
  1. Do not create two issues for the same problem (unless you close the first one).
  2. xpos is not the same as qpos.
  3. This is an issue for the MuJoCo team; it has nothing to do with the Gymnasium-Robotics API.