rail-berkeley / rlkit

Collection of reinforcement learning algorithms
MIT License

Position Control with mujoco-py #161

Open bara-bba opened 2 years ago

bara-bba commented 2 years ago

Hi everyone! I would like to control a robotic end effector (EE) in position only, so I set up the actuator section of the XML model accordingly, along with everything else needed. The problem I encounter is that during RL the action is sampled within the `ctrlrange`, but I would like to reach the whole joint space while keeping the sampled action limited. Is there any way to solve this? Thanks!

THE XML FILE

THE ENVIRONMENT

    import numpy as np
    from gym import utils
    from gym.envs.mujoco import mujoco_env

    class PandaEnv(mujoco_env.MujocoEnv, utils.EzPickle):

        def __init__(self):
            utils.EzPickle.__init__(self)
            mujoco_env.MujocoEnv.__init__(self, "/home/bara/doc/rlkit/generic/panda.xml", 2)

        def step(self, a):

            # STEP
            self.do_simulation(a, self.frame_skip)

            # DISTANCE
            xpos_component = self.get_body_com("component")
            xpos_target = self.get_body_com("target")
            dist = xpos_component - xpos_target

            # REWARD
            reward_dist = -np.linalg.norm(dist)
            reward = reward_dist  # More contributions to rewards may be added
            # print("REWARD: " + str(reward))
            # print("ACTION: " + str(a))

            ob = self._get_obs()
            done = False

            return ob, reward, done, dict(reward_dist=reward_dist)

        def viewer_setup(self):
            self.viewer.cam.trackbodyid = 0

        def reset_model(self):
            qpos = np.zeros(self.model.nq)
            qvel = np.zeros(self.model.nv)
            self.set_state(qpos, qvel)
            return self._get_obs()

        def _get_obs(self):
            return np.concatenate(
                [
                    self.get_body_com("component"),
                    self.get_body_com("target")
                ]
            )
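
One possible way to get the behaviour asked about above is to treat the sampled action as a bounded increment on the position target rather than as the target itself. The sketch below is only an illustration under assumptions: the helper name `delta_to_position_target`, the step size `max_delta`, and the `joint_low` / `joint_high` arrays are hypothetical and do not come from the environment above or from rlkit.

```python
import numpy as np


def delta_to_position_target(current_target, action, max_delta, joint_low, joint_high):
    """Convert a bounded action into an absolute position target.

    Hypothetical helper: `action` is assumed to lie in [-1, 1] per joint,
    `max_delta` is the largest allowed change of the target per step, and
    `joint_low` / `joint_high` are the full joint limits.
    """
    # Keep the policy output inside the small, fixed action range.
    action = np.clip(np.asarray(action, dtype=float), -1.0, 1.0)
    # Move the previous target by a bounded increment.
    new_target = np.asarray(current_target, dtype=float) + max_delta * action
    # The accumulated target may reach anywhere within the joint limits.
    return np.clip(new_target, joint_low, joint_high)
```

Under this assumption, `step` would keep a running target (for example `self.target`, reset to the initial joint positions in `reset_model`), update it with the helper, and pass the updated target to `self.do_simulation` instead of the raw action. The `ctrlrange` in the XML would then span the full joint range, and since `MujocoEnv` derives its action space from `ctrlrange`, the environment's action space would probably need to be overridden with the smaller [-1, 1] box.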