google-deepmind / dm_control

Google DeepMind's software stack for physics-based simulation and Reinforcement Learning environments, using MuJoCo.
Apache License 2.0
3.82k stars 669 forks source link

About the action of Jaco-reach #447

Closed yingchengyang closed 9 months ago

yingchengyang commented 10 months ago

I'm curious about the action of Jaco-reach (https://github.com/google-deepmind/dm_control/blob/main/dm_control/manipulation/reach.py). Whether the actions are directly related to the position of the robot and thus are not related to some attributes of the robot? For example, if I change the mass or damping of the robot (by changing env._physics.model.body_mass or env._physics.model.dof_damping), it seems that the environments are the same.

Thanks a lot!

yuvaltassa commented 9 months ago

You are probably not changing them enough or changing them and then they get overwritten. Changing the physical properties of the robot will definitely change the behaviour.

yingchengyang commented 8 months ago

Thanks a lot! I hope to create two reach environments with different masses. I directly change the variable env._physics.model.body_mass, but it seems that I do not successfully change its mass as the return after stepping one action is the same as below:

>>> from dm_control import manipulation
>>> import numpy as np
>>> env_a = manipulation.load('reach_duplo_vision', seed=0)
>>> env_b = manipulation.load('reach_duplo_vision', seed=0)
>>> env_b._physics.model.body_mass = 100.0 * env_b._physics.model.body_mass
>>> obs_a = env_a.reset()
>>> obs_b = env_b.reset()
>>> action_spec = env_a.action_spec()
>>> action = np.random.uniform(action_spec.minimum, action_spec.maximum, size=action_spec.shape)
>>> obs_a = env_a.step(action)
>>> obs_b = env_b.step(action)
>>> print(obs_a)

         [ 61,  92, 122],
         [ 61,  91, 122],
         [ 61,  91, 121]],

        [[ 30,  60,  90],
         [ 30,  60,  90],
         [ 30,  61,  91],
         ...,
         [ 61,  91, 122],
         [ 61,  91, 122],
         [ 60,  90, 121]],

        [[ 30,  60,  90],
         [ 30,  60,  90],
         [ 30,  61,  91],
         ...,
         [ 61,  91, 122],
         [ 61,  91, 121],
         [ 60,  90, 121]]]], dtype=uint8)), ('jaco_arm/joints_pos', array([[[ 0.99019207, -0.13971277],
        [ 0.32524643, -0.94562929],
        [-0.99679483,  0.08000038],
        [ 0.37522412,  0.92693412],
        [-0.97989668,  0.19950564],
        [-0.86523415,  0.501368  ]]])), ('jaco_arm/joints_torque', array([[-5.17356959e-02,  2.01732767e+00, -1.49492643e+00,
        -4.86451538e-01,  2.67781563e-02,  1.53640458e-03]])), ('jaco_arm/joints_vel', array([[ 0.14820925, -0.10040166, -0.21730423,-0.10535219, -0.11907987,
         0.11611988]])), ('jaco_arm/jaco_hand/joints_pos', array([[1.00143412, 1.01505313, 1.01504966]])), ('jaco_arm/jaco_hand/joints_vel', array([[-0.30883389,  0.31029122,  0.31021528]])), ('jaco_arm/jaco_hand/pinch_site_pos', array([[0.01938608, 0.00727906, 0.28493621]])), ('jaco_arm/jaco_hand/pinch_site_rmat', array([[ 4.02349688e-03,  9.99987751e-01,  2.88256540e-03,
         9.99991895e-01, -4.02305274e-03, -1.59858894e-04,
        -1.48260224e-04,  2.88318522e-03, -9.99995833e-01]]))]))
>>> print(obs_b)

         [ 61,  92, 122],
         [ 61,  91, 122],
         [ 61,  91, 121]],

        [[ 30,  60,  90],
         [ 30,  60,  90],
         [ 30,  61,  91],
         ...,
         [ 61,  91, 122],
         [ 61,  91, 122],
         [ 60,  90, 121]],

        [[ 30,  60,  90],
         [ 30,  60,  90],
         [ 30,  61,  91],
         ...,
         [ 61,  91, 122],
         [ 61,  91, 121],
         [ 60,  90, 121]]]], dtype=uint8)), ('jaco_arm/joints_pos', array([[[ 0.99019207, -0.13971277],
        [ 0.32524643, -0.94562929],
        [-0.99679483,  0.08000038],
        [ 0.37522412,  0.92693412],
        [-0.97989668,  0.19950564],
        [-0.86523415,  0.501368  ]]])), ('jaco_arm/joints_torque', array([[-5.17356959e-02,  2.01732767e+00, -1.49492643e+00,
        -4.86451538e-01,  2.67781563e-02,  1.53640458e-03]])), ('jaco_arm/joints_vel', array([[ 0.14820925, -0.10040166, -0.21730423,-0.10535219, -0.11907987,
         0.11611988]])), ('jaco_arm/jaco_hand/joints_pos', array([[1.00143412, 1.01505313, 1.01504966]])), ('jaco_arm/jaco_hand/joints_vel', array([[-0.30883389,  0.31029122,  0.31021528]])), ('jaco_arm/jaco_hand/pinch_site_pos', array([[0.01938608, 0.00727906, 0.28493621]])), ('jaco_arm/jaco_hand/pinch_site_rmat', array([[ 4.02349688e-03,  9.99987751e-01,  2.88256540e-03,
         9.99991895e-01, -4.02305274e-03, -1.59858894e-04,
        -1.48260224e-04,  2.88318522e-03, -9.99995833e-01]]))]))

So how can I change the mass of a reach robot to change its dynamic? Thanks a lot!