Farama-Foundation / Gymnasium-Robotics

A collection of robotics simulation environments for reinforcement learning
https://robotics.farama.org/
MIT License

Creation of Custom FetchObstaclePickAndPlace environment faces issue with adding obstacle data in the observation of the environment #183

Closed ChristosPeridis closed 6 months ago

ChristosPeridis commented 9 months ago

Hello dear members of the Farama Team!

I have made a fork of the Gymnasium-Robotics API and am working on a custom environment based on the 'FetchPickAndPlace-v2' environment. The new environment introduces an obstacle into the simulation. I have created the .xml file that adds the obstacle, which is named 'objstacle0'. For this environment I want to extend the original 'FetchPickAndPlace' observation with the following:

  1. The position of the obstacle in the 3D space.
  2. The relative position of the robot gripper and the obstacle.
  3. The relative position of the object and the obstacle.

However, after creating the environment and performing some steps in it, the environment returns zeros for the three new entries added to the observation. An example can be seen in the picture below:

image

To debug this behaviour further, I imported mujoco and changed the environment's mujoco_utils attribute from hidden to public, so that I could access the low-level methods that extract data from the MuJoCo simulator. I was surprised to see that the obstacle and the object were returning the same positions:

image

I retrieved this data using the 'get_site_xpos()' method of the 'mujoco_utils' package. However, the two cannot really be at the same position, because the observation shows that the achieved goal has a different value. Furthermore, the image of the simulation shows the object at a different position from the obstacle.

image

image
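One possible explanation for the identical positions is worth checking. MuJoCo's `mj_name2id` returns -1 when no object of the requested type has the given name. If `get_site_xpos` indexes `data.site_xpos` with that result without validating it (an assumption about the helper, not verified here), the index -1 silently selects the last site in the model. This would apply if "obstacle0" is not declared as a `<site>` in the .xml, or if the name is spelled differently there. The sketch below reproduces the pitfall in plain Python. `SITE_NAMES`, `SITE_XPOS`, `name_to_id`, and both lookup helpers are hypothetical stand-ins with made-up values, not part of MuJoCo or Gymnasium-Robotics.

```python
# Stand-in for a name lookup that returns -1 for an unknown name.
# SITE_NAMES and SITE_XPOS hold made-up values for illustration only.
SITE_NAMES = ["robot0:grip", "target0", "object0"]
SITE_XPOS = [
    [1.34, 0.75, 0.53],  # robot0:grip
    [1.30, 0.90, 0.42],  # target0
    [1.25, 0.60, 0.42],  # object0 (last site in this toy model)
]


def name_to_id(name):
    """Mimic the -1 convention for names that are not found."""
    return SITE_NAMES.index(name) if name in SITE_NAMES else -1


def unchecked_site_xpos(name):
    """Index directly with the lookup result, without validating it."""
    return SITE_XPOS[name_to_id(name)]


def checked_site_xpos(name):
    """Raise a clear error instead of silently using index -1."""
    site_id = name_to_id(name)
    if site_id < 0:
        raise KeyError(f"no site named {name!r}; known sites: {SITE_NAMES}")
    return SITE_XPOS[site_id]


# "obstacle0" is not a site in this toy model, so the unchecked lookup
# falls through to index -1 and returns the position of "object0".
same = unchecked_site_xpos("obstacle0") == unchecked_site_xpos("object0")
print(same)
```

A quick check along these lines in the real environment would be to confirm that the name lookup for "obstacle0" returns a non-negative id before using it as an index.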

Here is the overridden version of the 'generate_mujoco_observations()' method that I am using:

def generate_mujoco_observations(self):
    # positions
    grip_pos = self._utils.get_site_xpos(self.model, self.data, "robot0:grip")

    dt = self.n_substeps * self.model.opt.timestep
    grip_velp = (
        self._utils.get_site_xvelp(self.model, self.data, "robot0:grip") * dt
    )

    robot_qpos, robot_qvel = self._utils.robot_get_obs(
        self.model, self.data, self._model_names.joint_names
    )
    if self.has_object:
        object_pos = self._utils.get_site_xpos(self.model, self.data, "object0")
        # rotations
        object_rot = rotations.mat2euler(
            self._utils.get_site_xmat(self.model, self.data, "object0")
        )
        # velocities
        object_velp = (
            self._utils.get_site_xvelp(self.model, self.data, "object0") * dt
        )
        object_velr = (
            self._utils.get_site_xvelr(self.model, self.data, "object0") * dt
        )
        # gripper state
        object_rel_pos = object_pos - grip_pos
        object_velp -= grip_velp
    else:
        object_pos = (
            object_rot
        ) = object_velp = object_velr = object_rel_pos = np.zeros(0)
    gripper_state = robot_qpos[-2:]

    gripper_vel = (
        robot_qvel[-2:] * dt
    )  # change to a scalar if the gripper is made symmetric

    # Extract the positions of the gripper and the object
    #grip_pos = self._utils.get_site_xpos(self.model, self.data, "robot0:grip")
    #object_pos = self._utils.get_site_xpos(self.model, self.data, "object0")

    # Look up the obstacle's position (assuming its site name is "obstacle0")
    obstacle_pos = self._utils.get_site_xpos(self.model, self.data, "obstacle0")

    # Calculate the relative positions
    gripper_to_obstacle = obstacle_pos - grip_pos
    object_to_obstacle = obstacle_pos - object_pos

    # Extend the original observations with the relative positions
    #extended_observations = observations + (gripper_to_obstacle,) + (object_to_obstacle,)        
    return (
        grip_pos,
        object_pos,
        object_rel_pos,
        gripper_state,
        object_rot,
        object_velp,
        object_velr,
        grip_velp,
        gripper_vel,
        obstacle_pos,
        gripper_to_obstacle,
        object_to_obstacle
    )
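Returning three extra vectors from this method does not by itself change the observation. Whatever calls it still has to unpack all twelve values and concatenate them, and the declared observation space has to grow to match. The sketch below illustrates the expected sizes only. It assumes the caller (presumably `_get_obs`, which is not shown in this post) flattens the returned vectors in order. Plain Python lists stand in for NumPy arrays so that it runs without dependencies, and `flatten` and the dummy values are hypothetical.

```python
from itertools import chain

# Lengths of the nine vectors returned by the original method, in order:
# grip_pos, object_pos, object_rel_pos, gripper_state, object_rot,
# object_velp, object_velr, grip_velp, gripper_vel
BASE_LENGTHS = [3, 3, 3, 2, 3, 3, 3, 3, 2]
# The three extra vectors: obstacle_pos, gripper_to_obstacle, object_to_obstacle
EXTRA_LENGTHS = [3, 3, 3]


def flatten(fields):
    """Concatenate a tuple of vectors into one flat observation list."""
    return list(chain.from_iterable(fields))


# Dummy data: base vector i is filled with float(i), extras with 100.0 + i.
base_fields = tuple([float(i)] * n for i, n in enumerate(BASE_LENGTHS))
extra_fields = tuple([100.0 + i] * n for i, n in enumerate(EXTRA_LENGTHS))

old_obs = flatten(base_fields)
new_obs = flatten(base_fields + extra_fields)
print(len(old_obs), len(new_obs))  # 25 34
```

If the flattened observation in the custom environment still has 25 entries, the extra vectors are probably being dropped before concatenation rather than computed incorrectly.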

And here is the code of obstacle_pick_and_place.xml:

Could you please help me debug and fix this behaviour?

Thank you very much in advance for all the valuable help and support!

Kind regards,

Christos Peridis