ODE trimesh warning on the Panda model

robotology / gym-ignition-models

Collection of robot models compatible with gym-ignition

https://github.com/robotology/gym-ignition

GNU Lesser General Public License v3.0


ODE trimesh warning on the Panda model #8

Closed paolo-viceconte closed 4 years ago

paolo-viceconte commented 4 years ago

While testing the model of the Panda robot in simulation, I noticed that when its joint positions are initialized randomly within their limits, the following warning sometimes appears:

ODE Message 2: Trimesh-trimesh contach hash table bucket overflow - close contacts might not be culled in AddContactToNode() [collision_trimesh_trimesh_new.cpp:226]

The very same warning occasionally appears during the simulation, when the robot reaches some configurations which visually don't seem to be inconsistent. When this happens, the robot gets stuck but the simulation doesn't stop.

We need to further investigate what causes this behaviour.

diegoferigo commented 4 years ago

Let me add some more context. The default physics engine of Ignition Gazebo is DART, which uses ODE for collision detection. In this case the message is generated by ODE, and somehow this failure causes DART to ignore all the subsequent torque references that it receives (the simulation keeps advancing, and in @paolo-viceconte's setup there's a fixed-base whole-body controller that computes joint torques).

paolo-viceconte commented 4 years ago

We also checked the shape of the collision meshes in Gazebo and they seem to be consistent with the visual meshes.

[Images: panda_collisions_3, panda_collissions_2]

traversaro commented 4 years ago

I guess that ODE has some kind of maximum number of contacts per node (see https://bitbucket.org/odedevs/ode/src/1aa0130b9b628da0791a75ae6656546d8eb4b760/ode/src/collision_trimesh_trimesh.cpp#lines-224 and https://bitbucket.org/odedevs/ode/src/1aa0130b9b628da0791a75ae6656546d8eb4b760/ode/src/collision_trimesh_opcode.h#lines-56), and that we hit it with our complex collision meshes. This is probably not a problem that affects a lot of simulations, as it is common to use primitive shapes for collisions in physics simulations (small note: while SDF and URDF only distinguish between "visual" and "collision" meshes, the collision meshes that are convenient to use for motion planning are not necessarily the ones that make sense for physical simulation).

However, as soon as you find a configuration that reproduces this problem with ign-gazebo alone, I would open an issue in ign-gazebo: if this limitation is in place, it should at least be properly documented.

paolo-viceconte commented 4 years ago

"The very same warning occasionally appears during the simulation, when the robot reaches some configurations which visually don't seem to be inconsistent. When this happens, the robot gets stuck but the simulation doesn't stop."

After more trials I noticed that I got this wrong. If I run one single rollout (even an extremely long one), the warning does not appear and everything is fine. The problem arises instead if I run multiple rollouts one after the other: at the very beginning of each new rollout, one or more ODE trimesh warnings appear and the simulation freezes for a variable amount of time (between one and several seconds). After that, the simulation resumes normally.

In particular, the warning is generated within the first run() of the gazebo simulator executed after the re-initialization of the environment, i.e. after the removal of the old robot object and the insertion of the new one.

It turns out that the warning disappears when an additional gazebo.run() is inserted between the removal of the old robot and the insertion of the new one. This seems to be required to prevent the simulator from computing collisions between the two robot objects. Even though the old robot has been deleted, it is likely still present in the state of the simulator until a new step is executed.

diegoferigo commented 4 years ago

Some more context about how gym-ignition resets an environment.

Each gym-ignition Task operates on a Robot object. When the Task is reset, the robot is removed and added again into the simulation. This process is done inside the same simulation step; in other words, there is no physics step between the deletion of the old robot and the insertion of the new one.

With other models, even fairly complicated ones such as the iCub model, we never experienced this ODE warning and the related simulation slowdown. There's something specific to the Panda model.

Anyhow, introducing an extra step associated with each Task reset is not a problem, and it will probably also avoid possible related problems in the future. I also don't think that this extra step will interfere with simulations where multiple instances of the Task are executed in parallel in the same world.

I suspect that the problem arises from the internals of Gazebo. The simulator processes the removal of the model entities in the next physics step, but in the same step it also adds the new model. Since the two robots are completely distinct models (we use a name prefix to differentiate models inserted from the same SDF), depending on how these operations are executed, there could be collision problems.
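To make the suspected mechanism concrete, here is a minimal toy sketch in Python. It does not use the gym-ignition or Ignition Gazebo APIs: `ToyWorld`, its methods, and the model names `panda_0` and `panda_1` are hypothetical, and the ordering inside `run()` (insertions applied before the collision pass, removals after it) is only an assumption about how such an overlap could arise.

```python
class ToyWorld:
    """Hypothetical stand-in for a simulator that defers model removal."""

    def __init__(self, initial_models=()):
        self.models = set(initial_models)  # models seen by the physics
        self._pending_removal = set()
        self._pending_insertion = set()

    def remove_model(self, name):
        # Removal is only requested here; it takes effect inside run().
        self._pending_removal.add(name)

    def insert_model(self, name):
        self._pending_insertion.add(name)

    def run(self):
        """Advance one step and return the models that coexisted during it."""
        # Assumed ordering: insertions first, then collisions, then removals.
        self.models |= self._pending_insertion
        self._pending_insertion.clear()
        coexisting = set(self.models)  # what the collision pass would see
        self.models -= self._pending_removal
        self._pending_removal.clear()
        return coexisting


def reset_without_extra_step(world):
    """Remove and re-insert within the same step (the original reset)."""
    world.remove_model("panda_0")
    world.insert_model("panda_1")
    return world.run()


def reset_with_extra_step(world):
    """Run one extra step so the removal is processed before the insertion."""
    world.remove_model("panda_0")
    world.run()
    world.insert_model("panda_1")
    return world.run()


overlapping = reset_without_extra_step(ToyWorld({"panda_0"}))
separated = reset_with_extra_step(ToyWorld({"panda_0"}))
print(sorted(overlapping))  # both robots share one collision pass
print(sorted(separated))    # only the new robot is present
```

Under these assumptions, the reset without the extra step lets the old and the new robot occupy the same pose during one collision pass, while the extra step keeps them apart.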

paolo-viceconte commented 4 years ago

Closed via https://github.com/robotology/gym-ignition/issues/136.

FirefoxMetzger commented 3 years ago

I'll leave a comment here for future reference because this happens to be the no. 1 search result on Google for this warning. The above comments nicely describe the problem within Ignition (and the accompanying synchronization issues). I want to add some context as to why the simulation slows down.

The reason this happens is that both objects use a trimesh (a polygon mesh made solely from triangles) for collision. As shapes are arbitrary, the only general way to check for collision is to loop over all faces in one mesh and check if they intersect with faces of the other mesh; obviously an expensive operation. To speed this up, ODE seems to use a hash table under the hood to leverage data from previous iterations. However, once this is full it needs to fall back to the slow way of checking all the face pairs.
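As a rough illustration of why the all-pairs fallback is expensive, the following Python sketch tests every face of one mesh against every face of the other. It is not ODE's implementation: the function names are hypothetical, and an axis-aligned bounding-box overlap check is used as a cheap stand-in for an exact triangle-triangle intersection test.

```python
from itertools import product


def aabb(triangle):
    """Axis-aligned bounding box of a triangle given as three (x, y, z) points."""
    xs, ys, zs = zip(*triangle)
    return (min(xs), min(ys), min(zs)), (max(xs), max(ys), max(zs))


def boxes_overlap(box_a, box_b):
    """True if two axis-aligned boxes overlap on all three axes."""
    (min_a, max_a), (min_b, max_b) = box_a, box_b
    return all(min_a[i] <= max_b[i] and min_b[i] <= max_a[i] for i in range(3))


def brute_force_candidates(mesh_a, mesh_b):
    """Check every face pair; return (number of tests, overlapping index pairs)."""
    tests = 0
    hits = []
    for (i, tri_a), (j, tri_b) in product(enumerate(mesh_a), enumerate(mesh_b)):
        tests += 1
        # Stand-in for an exact triangle-triangle intersection test.
        if boxes_overlap(aabb(tri_a), aabb(tri_b)):
            hits.append((i, j))
    return tests, hits


near = ((0, 0, 0), (1, 0, 0), (0, 1, 0))
far = ((10, 0, 0), (11, 0, 0), (10, 1, 0))
tests, hits = brute_force_candidates([near, far], [near])
print(tests, hits)
```

The number of tests is the product of the two face counts, so two meshes with a thousand faces each already require a million pair tests per collision query, and this cost is paid again at every step in which the meshes remain in contact.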

Another thing that can happen is that collision meshes get stuck inside one another due to simulation inaccuracies, or because two models are spawned inside one another, as in this case. If this happens, DART/ODE can't "unstick" the objects and they will remain in constant collision, triggering a trimesh-trimesh collision check in every simulation step.