dartsim / dart

DART: Dynamic Animation and Robotics Toolkit
http://dartsim.github.io/
BSD 2-Clause "Simplified" License
893 stars 286 forks source link

contact normal changes randomly between two opposite directions in Gazebo #1425

Open ClementRolinat opened 4 years ago

ClementRolinat commented 4 years ago

hello,

I am using DART as physics engine for Gazebo 9.

I found this bug when attaching a contact sensor to a model.

When I display the gazebo topic associated with this sensor, I can see that the normals of the contacts are not constant, even if the model is lying on the ground and is at rest (only gravity force is applied on it). More precisely, the directions of the normals oscillate between approximately (0, 0, 1), and (-0, -0, -1). The sign of the associated force also vary accordingly.

The bug is very easy to reproduce in Gazebo, you just need to attach a contact sensor to a model, and display the associated gazebo topic.

However, I don't know how to reproduce it with DART as a standalone, as I never used it.

I think the bug may be in dart code, as I already checked the gazebo code and didn't find the bug yet... Morevover, the contact normal behave correctly with ODE physics engine.

Does anyone has already seen this in dart ? Or at the contrary, can confirm that the contacts are working properly in dart, and this issue is coming from Gazebo ?

mxgrey commented 4 years ago

DART doesn't do collision detection itself. Instead that job is outsourced to a choice of FCL, Bullet, or ODE, with FCL being the default.

If there are collision detection errors it will either be in the underlying collision detection library or possibly in the way DART is using that library.

But something to consider: If you're doing mesh-mesh collision it's pretty common for a collision detection engine to misinterpret which direction a normal is in. I recommend using primitive shapes as much as possible to get the most reliable results.

ClementRolinat commented 4 years ago

thanks for the answer.

I'm not sure if the collision detection itself is wrong. Indeed, the physics behavior looks ok when a contact occur. It's just the returned normal and force that sometimes have the wrong sign. Maybe dart is doing some frame transforms (or other post processing) that could be wrong ?

I tested both mesh-primitive and primitive-primitive collisions in Gazebo. Both of them are ok with ODE physics engine, and both are wrong with dart physics engine.

EDIT: I looked into the dart code. Can you confirm that the default behavior (when dart is used with Gazebo) is to use the FCL collision detection in https://github.com/dartsim/dart/tree/master/dart/collision/fcl ?

in https://github.com/dartsim/dart/blob/master/dart/collision/fcl/FCLCollisionDetector.cpp, what is the difference between postProcessDART and postProcessFCL ? And which one is called by default ? I'm not sure to understand what is happening in those functions, but as they manipulate contacts, the bug may be somewhere here, no ?

mxgrey commented 4 years ago

You are correct that FCL will be the default collision detector when DART is used by Gazebo.

The post-processing is related to this comment which refers to this issue.

Basically there are known issues in FCL where garbage contact points are reported. We do some post-processing to attempt to cut down on such points, but if I remember correctly it doesn't work perfectly, and it could explain some of the issues you're experiencing with getting the incorrect normal direction.

DART also supports using Bullet collision detection and ODE collision detection. I've anecdotally found the ODE collision detection to be very reliable for my use cases with little or no surprising behavior.

Unfortunately to change DART's underlying collision detection library in Gazebo there would need to be a few changes to both sdformat and Gazebo, and that may take some time to get finished upstream by the maintainers since it's not likely to be seen as a major priority.

mxgrey commented 4 years ago

If you're okay with building Gazebo from source code, you could consider cloning the Gazebo source code and adding a line below this one to change the collision engine similar to this example.

You would just add the following to gazebo/physics/dart/DARTPhysics.cc:

// Near the top
#include <dart/collision/ode/OdeCollisionDetector.hpp>

/* .... */
// in void DARTPhysics::Load(sdf::ElementPtr _sdf)
this->dataPtr->dtWorld->getConstraintSolver()->setCollisionDetector(
        dart::collision::OdeCollisionDetector::create());

That will change the collision detection engine that DART uses within Gazebo to the ODE collision detection engine. If that continues to have the same issue whereas using the ODE physics engine does not have any problem, then I would definitely be suspicious of DART code.

ClementRolinat commented 4 years ago

I tried really hard to find the bug, I think I read almost all the Gazebo and DART collision and contact related code, but I still couldn't find anything that could lead to my issue.

So maybe the bug is only inside FCL. But if it's at such a low level, I don't understand how the physics behavior could still be good, with such inconsistant contact normal data ?

Anyway, someone with a better understanding of DART (and maybe Gazebo) framework than me will have to look at it...

If the bug is only related to FCL, a quick (but dirty) fix for me would be to compile a custom version of DART where the default behavior is to use ODE collision detection. I could then recompile Gazebo with this custom DART version. Can you tell me in which file this parameter is set ?

EDIT: ok we had the same idea at the same time ! So it's even simplier than I imagined it. I will test it as soon as possible !

mxgrey commented 4 years ago

Oh in my last post I forgot that you'll need to have Gazebo find and link to the collision-ode component of DART, which does add some complexity to things.

To do this switch from inside of DART instead, you could merge the collision-ode component library into the core DART library the way FCL already is. (See the differences between the FCL cmake and the ODE cmake). You'll need to make the ODE CMakeLists.txt look more like FCL's.

After that you'll need to make ODE the default collision detector here.

Whether you change the detector through Gazebo or through DART you'll have to do some CMake butchery. I personally think changing it in Gazebo might be easier, but I'll leave that to you to judge.

ClementRolinat commented 4 years ago

Yes I think it is easier to do the switch from Gazebo.

Concerning the cmake trick, I think it's already done here: SearchForStuff.cmake. Am I right ?

mxgrey commented 4 years ago

Oh wow yes it's in there already. I wonder why since it's never been used as far as I'm aware 🤔

But yes, that should mean that my earlier post is all you have to change in Gazebo to try out the other collision detection engines.

ClementRolinat commented 4 years ago

here a console log when I run cmake to configure the Gazebo compilation: Found DART: /usr/include (Required is at least version "6") found components: dart missing components: collision-bullet collision-ode utils-urdf

My version of DART is the binary from apt. My libdart6 and libdart6-dev packages are both on the 6.10.0 version from http://packages.osrfoundation.org/gazebo/ubuntu-stable. It seem that it doesn't have the collision-ode component in it ?

ClementRolinat commented 4 years ago

finally I compiled DART from source to make sure that collision-ode is installed.

Then, I was able to compile Gazebo (following the tutorial for installing from source). In the cmake phase it found collision-ode correctly, and finished compilation without complaining about anything.

but now, when I try to launch gazebo, I have this error: gzserver: error while loading shared libraries: libdart-collision-ode.so.6.9: cannot open shared object file: No such file or directory So I'm a little bit confused...

edit: I checked if libdart-collision-ode.so.6.9 existed, and yes, I can see it in /usr/local/lib

ClementRolinat commented 4 years ago

So, I finally have Gazebo working, with DART using ode collision detection.

But, after some testing, it is not working well:

  1. With DART engine, Gazebo is now crashing immediately (segmentation fault) when spawning a model defined by a mesh.
  2. for primitive shape, everything is not ok: the normal directions are good, but the contact positions are wrong for a cylinder (and the contact positions are good for the same cylinder when using ode physics engine).

contacts on a cylinder, in Gazebo using ODE physics engine: image

contacts on a cylinder, in Gazebo using DART physics engine, using ODE collision detection: image

i also tried DART engine with bullet collision detection. With this one, the contact normals are in the inverse direction compared to ode collision detection, so I'm not sure if it is good or not. At least Gazebo is not crashing when I import a mesh defined model, but the contact are very unstable and the computation is very very slow.

scpeters commented 4 years ago

just addressing the debian packages, libdart6-dev doesn't have all the components, I also install libdart6-collision-ode-dev and libdart6-utils-urdf-dev when I build ign-physics

azeey commented 4 years ago

There is now support for picking the collision detector for DART in Gazebo (https://bitbucket.org/osrf/gazebo/pull-requests/2956/dart-heightmap-with-bullet-and-ode/diff), but as you'll see in the PR, using ODE is not allowed because the ODE used by DART can clash with the ODE that ships with Gazebo. In my testing, it works for primitive shapes, but crashes with meshes.

ClementRolinat commented 4 years ago

ok, thanks @azeey , it explains the crashes when trying to spawn a mesh. So, I guess I'm stuck with this normal direction issue with FCL...

mxgrey commented 4 years ago

You could see if the Bullet collision detector works better, but there are currently some known issues with that one too, unfortunately.

ClementRolinat commented 4 years ago

As I said in a previous comment, I already tested bullet collision detector. The contacts are unstable (maybe related to your issue as there were contacts with ground plane), and the simulation is really slow : 1 second in real time is 0.03 second in the simulator (!!!)...

Should I report this issue (wrong contact normal direction) on FCL issue tracker ?

ClementRolinat commented 4 years ago

hello, I have some updates from the issue I posted on fcl repo : issue 422. FCL has a known issue (368) which lead two colliding objects to change order, which then cause the normal to change direction. So the question is why DART doesn't keep track of these changes, and how to fix this ?

It has been suggested to check if the pointer to colliding objects returned by a fcl contact, o1 and o2, are in a given order.

ClementRolinat commented 4 years ago

Does someone have any idea on that ?

ClementRolinat commented 4 years ago

does anyone had the time to look into it ?

ClementRolinat commented 4 years ago

I haven't the time to dig into dart, but if someone could tell me in which class/function the change should be made, I can do the coding & testing part and do a PR.

For now I am using the ODE engine of Gazebo, which has correct normal information. But Dart is way faster and stable, so I would rather use dart...

mxgrey commented 4 years ago

The only way I can think of addressing this issue, if it's really caused by the FCL broad phase issue, is modifying the FCL post-processing function to add something like a

std::unordered_map<CollisionObject*, std::unordered_map<CollisionObject*, bool>> ordering_map;

which has the following behavior:

bool o1_is_first = ordering_map[o1][o2];

When a new combination of values for o1 and o2 is found, then we insert two new entries into this map where

ordering_map[o1][o2] = true;
ordering_map[o2][o1] = false;

We'll need to store ordering_map in FCLCollisionDetector and pass a reference of it along to FCLCollisionCallbackData. Note that we need to give it a reference or a pointer to make sure that the information is preserved between calls.

One potential danger here is that this container needs to be cleared out as collision objects are deleted, because otherwise it could grow without any bound. That should probably be done here. Note that when we remove and object obj we should first iterate through the contents of ordered_map[obj] to remove the reverse map entries, e.g. ordered_map[o1][obj], ordered_map[o2][obj], ordered_map[o3][obj], etc.

ClementRolinat commented 4 years ago

thanks for your answer !

there is still something that I can't figure out about this issue. As the physics is consistent, and only the "monitoring" is impacted, it means that somewhere else in DART code, the mapping is made correctly, right ?

regarding your solution, correct me if I'm wrong, you suggest to :

mxgrey commented 4 years ago

there is still something that I can't figure out about this issue. As the physics is consistent, and only the "monitoring" is impacted, it means that somewhere else in DART code, the mapping is made correctly, right ?

If I understand the issue you were describing, it sounds like the normal vector flips direction in sync with the force magnitude flipping sign. The actual force vector that physics uses when computing contact forces will be the normal vector multiplied by the force magnitude. Therefore the sign flip of the magnitude will give a direction flip of the vector.

Perhaps one solution that could smooth this out without modifying DART would be to have the sensor plugin of Gazebo flip the sign of the normal vector when the force magnitude is negative, and always report a positive force magnitude.

if ordering_map[o1][o2] == false, flip the normal.

And also switch

  contact.collisionObject1 = static_cast<FCLCollisionObject*>(o2->getUserData());
  contact.collisionObject2 = static_cast<FCLCollisionObject*>(o1->getUserData());

and probably to other stuff, as it need to be available all the way down to convertContact

Yep, just put a reference to it into FCLCollisionCallbackData and then pass it down a few more functions after extracting it here.


But I do recommend first considering whether the tweak I recommended to the sensor plugin could solve this problem, because this order mapping solution would add a lot of hashing and map lookups which aren't otherwise necessary.

ClementRolinat commented 4 years ago

Perhaps one solution that could smooth this out without modifying DART would be to have the sensor plugin of Gazebo flip the sign of the normal vector when the force magnitude is negative, and always report a positive force magnitude.

I already thought about that. The fact is that the force is reported in the world frame, and not in the contact frame. Therefore, it is not possible to know a priori if the force is toward the object or not.

mxgrey commented 4 years ago

Is the sign of the force's magnitude supposed to represent whether the force is directed into/out of the sensor?

If so, perhaps the force sensor is making a faulty assumption about which body is object 1 vs object 2. Presumably the force sensor plugin should be able to look at the contact information and identify which of the two contact bodies is the body that the sensor is related to, and then flip the sign accordingly.

ClementRolinat commented 4 years ago

as the force is expressed in world frame, its sign is not supposed to represent the force direction relative to the object/sensor. The gazebo contact sensor just forward the contacts of a particular body to a particular gazebo topic. It does not decide which is object 1 or object 2, just forward existing information. The source is here : ContactSensor.hh, ContactSensor.cc

mxgrey commented 4 years ago

I've looked over the contact sensor code, and it confirmed what I suspected.

What I would recommend trying is changing this block of code to the following:

      double sign = 1.0;

      // If unable to find the first collision's name, try the second
      if (collIter == this->dataPtr->collisions.end())
      {
        sign = -1.0;
        collision1 = (*iter)->contact(i).collision2();
        collIter = std::find(this->dataPtr->collisions.begin(),
            this->dataPtr->collisions.end(), collision1);
      }

And then after this line you could modify the contact information by iterating through the contacts in contactMsg and multiplying every field by sign. I believe that should stabilize the signs on the message outputs.

ClementRolinat commented 4 years ago

yes I already had a similar idea. Unfortunately, the name in collision1 an collision2 doesn't flip at all. It may be because convertContact doesn't keep trace of the real ordering of colliding objects from fcl::Contact (line 1582 &1584) ?

mxgrey commented 4 years ago

I'm not sure what you mean by "keep trace of the real ordering". Typically the normal vector reported by a collision detector refers to the vector pointing from object 1 to object 2 (or vice-versa, I can't remember) starting from the point of contact.

DART will use whatever ordering is reported by FCL, and presumably the reported normal will be consistent with that.

I haven't looked at the entire pipeline of how contact points move through Gazebo, but it's hard for me to imagine that the ordering of the collision objects would be getting non-deterministically flipped.

Maybe it would be constructive to ask: How specifically are you wanting to use this contact information, and what is the current problem that you are encountering with it? I understand that the text printout on the terminal is messy, but is there a specific impact that you're seeing on any code that uses this information? Maybe this entire situation is actually a non-issue besides the jarring way the data reads out to human eyes.

ClementRolinat commented 4 years ago

I am working on robotic grasping. I use the contact information, more specifically the contact normal and positions, to compute something called the Grasp map, a matrix which depend on the contact frames positions and orientations relative to the gasped object. For the computation to be consistent, the normal need to be always pointing toward the object. When I use ode physics engine, the normal is always pointing toward the object. When I use dart physics engine, the normal is oscillating between two opposite direction, and the force is following this oscillation. for exemple, this is a terminal output, for two different timesteps (I only put the beginning of the message, with position and normal information, and not force):

  collision1: "coude::link::collision"
  collision2: "conveyor::link::collision"
  position {
    x: 1.1531399488449097
    y: -0.478406697511673
    z: 0.63999998569488525
  }
  position {
    x: 1.2135769128799438
    y: -0.49053198099136353
    z: 0.63999998569488525
  }
  position {
    x: 1.252504825592041
    y: -0.47382703423500061
    z: 0.63999998569488525
  }
  normal {
    x: -0
    y: -0
    z: -1
  }
  normal {
    x: -0
    y: -0
    z: -1
  }
  normal {
    x: -0
    y: -0
    z: -1
  }

contact {
  collision1: "coude::link::collision"
  collision2: "conveyor::link::collision"
  position {
    x: 1.1531399488449097
    y: -0.478406697511673
    z: 0.63999998569488525
  }
  position {
    x: 1.2135769128799438
    y: -0.49053198099136353
    z: 0.63999998569488525
  }
  position {
    x: 1.252504825592041
    y: -0.47382703423500061
    z: 0.63999998569488525
  }
  normal {
    x: 2.8622937353617317e-17
    y: -2.7755575615628914e-17
    z: 1
  }
  normal {
    x: 2.8622937353617317e-17
    y: -2.7755575615628914e-17
    z: 1
  }
  normal {
    x: 2.8622937353617317e-17
    y: -2.7755575615628914e-17
    z: 1
  }

This was extracted when my object, "coude" (french for elbow, as it is an elbow pipe), rely on the "conveyor" (in gazebo it is just a box).

You can see that the normal are flipping, but not collision1 and collision2.

I'm not sure what you mean by "keep trace of the real ordering". Typically the normal vector reported by a collision detector refers to the vector pointing from object 1 to object 2 (or vice-versa, I can't remember) starting from the point of contact.

Maybe it's me that doesn't understand something. In convertContact line 1582 and 1584, the information put in collisionObject1 and collisionObject2 are not linked with the information provided by fcl::Contact: if o1 and o2 from fcl::Contact flip, collisionObject1 and collisionObject2 won't flip (at least I can't figure out how).

mxgrey commented 4 years ago

if o1 and o2 from fcl::Contact flip, collisionObject1 and collisionObject2 won't flip

Whatever FCL gives us as o1 will be put into collisionObject1, and whatever FCL gives us for o2 will be put into collisionObject2 as you can see here. So if the values of o1 and o2 get flipped, then so will the values of collisionObject1 and collisionObject2. Moreover, the normal vector and force values should both flip sign as well to be consistent with this reversal of which object is o1 and which object is o2.


I think I understand the difficulty of the situation that you're in. I'm very very surprised that my recommendation here isn't working. Maybe if you can show me the changes that you made which you believe match my recommendation, then I can identify why that didn't work.

ClementRolinat commented 4 years ago

instead of

contact.collisionObject1 = static_cast<FCLCollisionObject*>(o1->getUserData());
 contact.collisionObject2 = static_cast<FCLCollisionObject*>(o2->getUserData());

I would expect

contact.collisionObject1 = fclContact.o1;
contact.collisionObject2 = fclContact.o2;

I don't understand how you can be sure that o1 and o2 from the function parameter match the ordering of o1 and o2 inside fclContact.

You suggested earlier to check in the sensor code if collision1 and collision2 flipped or not. But as you can see on the terminal output, they don't flip, even if the normal change direction. Do you suggest that the terminal output doesn't represent the real state of the data (so a printing bug ?).

ClementRolinat commented 4 years ago

oh, I just noticed that in the sensor there is some swapping between collision1 and collision2. I'm sorry, I didn't fully understood your suggestion. It should work ! I will test that as soon as possible, and keep you updated ! thanks !

mxgrey commented 4 years ago

To be honest I'm very confused by what's happening. I'm sorry for not noticing that the printout keeps showing the object names in a consistent order; that does change my suspicions.

there is some swapping between collision1 and collision2

As I dig more into exactly what that code is doing, I think that should just make the sensor robust to whether the object of interest is specified as the first or second object in the contact data. I don't see any place in the sensor plugin code where the names would actually get swapped within the contact data.


I'm honestly very perplexed about this situation, and it's making me wonder if you're onto something about the assumption that o1 and o2 in the function argument will match the o1 and o2 in the contact data. For debugging purposes, it might be worth putting an if-statement here to check and scream out if there is an inconsistency.

Somewhere in the pipeline someone is definitely swapping the object names in order for the contact data to print out like that :dizzy_face:

ClementRolinat commented 4 years ago

[...] I think that should just make the sensor robust to whether the object of interest is specified as the first or second object in the contact data. I don't see any place in the sensor plugin code where the names would actually get swapped within the contact data.

I thought there was swapping here. But now after looking at it more closely, the contact is not modified, and as you said it seems to be a way to make the sensor robust to where the object of interest appear. But still it is a confusing way to do it in my opinion...

[...] o1 and o2 in the function argument will match the o1 and o2 in the contact data. For debugging purposes, it might be worth putting an if-statement here to check and scream out if there is an inconsistency.

This could be tricky because o1 and o2 in fcl::Contact are not of the same type as o1 and o2 from function parameters (see here). Maybe the convertContact implementation is like that because of this difference.

I'm still far from fully understanding this whole part of the code (how the pipeline from fcl to dart is working). Do you know who wrote the FCLCollisionDetector ?

mxgrey commented 4 years ago

I believe it was @jslee02 that implemented the FCLCollisionDetector class.

mxgrey commented 4 years ago

This could be tricky because o1 and o2 in fcl::Contact are not of the same type as o1 and o2 from function parameters

I believe you can do

if (o1->collisionGeometry().get() == fclContact.o1)

based on this API and this API.

ClementRolinat commented 4 years ago

ok thanks, I will try to look into it soon !

trns1997 commented 3 years ago

Hello guys stuff here looks fairly recent, and I also want to get ODE collision detection working with DART. I have made the following issue https://github.com/dartsim/dart/issues/1535#issue-795263195 . But I think the ODE used in DART conflicts with the ODE version used in gazebo. Is there a way to resolve this issue. I really want to avoid using FCL and bullet gives very poor results.

axkoenig commented 3 years ago

Hi everyone,

I have a similar issue as @ClementRolinat in that I work on robotic grasping and I am computing metrics based on the contact normals and forces. I am experiencing a very weird and, I think, related issue with Gazebo11 + DART6 + Bullet collision checking (Side note: I am using Bullet collision checking since ODE and FCL lead to weird "sticking" behavior and DART collision checking didn't seem to be properly integrated into Gazebo yet).

My problem is that in different instances of my Gazebo simulation the contact normals (and forces) point in different directions. For example, in simulation 1 I am touching a cylinder with the finger and the contact normal points towards the object. In simulation 2 the contact normal and force points the other way. In one simulation session the contact normals seem to consistently point one way and don't change direction (like in @ClementRolinat's post).

Does anyone have a clue how to solve this? @ClementRolinat did you find a solution in the end?

Simulation 1 out_of_finger

Simulation 2 into_finger

Cheers, Alex

trns1997 commented 3 years ago

Hey @axkoenig not sure this will help, but try checking the collision order. For example just add a debug statement:

std::cout << "Collision between[" << contacts.contact(i).collision1()
                      << "] and [" << contacts.contact(i).collision2() << "]\n";

If my guess is correct you will see that the contact order switches between simulations. Which means when you try to check for for forces on body_1 between simulations the contact body switches. Worth a check. But as far as why the forces are not shown in both directions, I am not entirely sure.

Hope this helps,

Thomas

ClementRolinat commented 3 years ago

Hello @axkoenig, sorry I don't have any solution, I finally decided to stay with the classic ode engine and don't use dart at all. Your issue seems a bit different from mine as you use bullet collision detection instead of the default fcl, and that your normal direction seems consistent across a given simulation instance (while in my case the direction oscillated from one timestep to an other).

axkoenig commented 3 years ago

Update: I noticed that the contact normal direction changes depending on which object is spawned first in Gazebo:

Also note that @trns1997's solution works! Cheers for that!

trns1997 commented 3 years ago

Hey @ClementRolinat, I think I recently faced the problem you were having, I ran the same sim with 3 different collision detectors (FCL, Bullet and Dart) and here are the results: contact_forces

Collision between[my_model::box_link::collision] and [ground_plane::link::collision]
TotalForce: -1.701280
Collision between[my_model::box_link::collision] and [ground_plane::link::collision]
TotalForce: 1.701280

I know not super helpful. FCL does seem to continuously alternate but with the same magnitude but weirdly the contact order doesn't seem to change (atleast from the plugins POV). But I was just wondering is it "really wrong" to just absolute the contact force (I guess it would depend on the use case). But either way if someone has any idea on why this is happening it would be cool to resolve it.