Slow simulation speeds for composite

amandlek commented 5 years ago

Using a composite to try and model a piece of cloth is leading to extremely slow simulation speeds. To contextualize this, we would like to see whether we can implement a simple environment with our Sawyer robot from robosuite and a piece of cloth on the table. I've posted a script below for testing the functionality - it simply reads the xml (which is taken from our SawyerLift environment), adds in a composite using the composer, then creates an environment from the final model. To run it, you should have robosuite installed.

from dm_control import mjcf
from dm_control import composer
import numpy as np
import sys
from dm_control import viewer

import robosuite as suite
env = suite.make(
        'SawyerLift',
        has_renderer=False,
        has_offscreen_renderer=False,
        ignore_done=True,
        use_camera_obs=False,
        control_freq=100,
    )
env.reset()
env_xml = env.model.get_xml()

cloth = """
<mujoco model="parent">
    <worldbody>
        <composite type="grid" count="9 9 1" spacing="0.05" offset="0.7 0 1.0">
                <geom size=".02"/>
        </composite>
    </worldbody>
</mujoco>"""

world = mjcf.from_xml_string(env_xml)
cloth = mjcf.from_xml_string(cloth)

class MyEntity(composer.ModelWrapperEntity):
    def _build(self, mjcf_model):
        self._mjcf_model = mjcf_model
        self._mjcf_root = mjcf_model

cloth_entity = MyEntity(cloth)
world_entity = MyEntity(world)

world_entity.attach(cloth_entity)

task = composer.NullTask(world_entity)
task.control_timestep = 0.02
env = composer.Environment(task)

viewer.launch(env)

alimuldal commented 5 years ago

I don't think this is an issue with the Python libraries per se - composite bodies are just inherently more expensive to simulate than simple rigid bodies since they impose a heavier load on MuJoCo's constraint solver. To profile your code I replaced the call to viewer.launch with a trivial run loop:

action_spec = env.action_spec()
null_action = np.zeros(action_spec.shape, action_spec.dtype)
num_steps = 1000
for _ in range(num_steps):
    env.step(null_action)

Using cProfile we can see that almost all of the time is spent in the Physics.step method: Drilling down a bit further using line_profiler, we see that almost all of the time is spent calling the mj_step2 function in the ctypes-wrapped MuJoCo library:

Total time: 46.1023 s                                                                                                               
File: /usr/local/google/home/alimuldal/.venvs/robosuite/lib/python3.6/site-packages/dm_control/mujoco/engine.py                     
Function: step at line 125                                                                                                          

Line #      Hits         Time  Per Hit   % Time  Line Contents                                                                      
==============================================================                                                                      
   125                                             @profile                                                                         
   126                                             def step(self):                                                                  
   127                                               """Advances physics with up-to-date position and velocity dependent fields.    
   128                                                                                                                              
   129                                               The actuation can be updated by calling the `set_control` function first.     
   130                                               """
   131                                               # In the case of Euler integration we assume mj_step1 has already been        
   132                                               # called for this state, finish the step with mj_step2 and then update all    
   133                                               # position and velocity related fields with mj_step1. This ensures that       
   134                                               # (most of) mjData is in sync with qpos and qvel. In the case of non-Euler    
   135                                               # integrators (e.g. RK4) an additional mj_step1 must be called after the      
   136                                               # last mj_step to ensure mjData syncing.                                      
   137      9990     320213.0     32.1      0.7      with self.check_invalid_state():                                              
   138      9990      93070.0      9.3      0.2        if self.model.opt.integrator == enums.mjtIntegrator.mjINT_EULER:            
   139      9990   42418420.0   4246.1     92.0          mjlib.mj_step2(self.model.ptr, self.data.ptr)                             
   140                                                 else:
   141                                                   mjlib.mj_step(self.model.ptr, self.data.ptr)                              
   142
   143      9990    3270555.0    327.4      7.1        mjlib.mj_step1(self.model.ptr, self.data.ptr)

mj_step2 is where MuJoCo calculates constraint forces (see here). Using the profiling tool built into the simulate binary that comes with MuJoCo, we can see that this accounts for most of the simulation cost:

amandlek commented 5 years ago

@alimuldal Thank you so much for the informative and detailed reply! Since this is limited by the constraint solver, it doesn't seem like there's anything that can be done to speed this up, besides limiting the grid of the composite. Do you have any other recommendations for making any reasonable approximation to a cloth work while maintaining reasonable simulation speeds?

alimuldal commented 5 years ago

Some obvious things to try:

Increase your simulation timestep by as much as you can get away with
Simplify your composite body if you can
Try using the conjugate gradient solver rather than the default Newton solver (e.g. world.option.set_attributes(solver='CG'). The CG solver is much faster (I see about a 3x speed up for your example), but you may find that it converges less reliably.

MuJoCo's simulate binary is quite useful for playing around with solver parameters and debugging performance issues. You can use mjcf.export_with_assets(mjcf_model, '/some/path/') to export your MJCF model in a format that can be easily loaded by simulate.

google-deepmind / dm_control

Slow simulation speeds for composite #83