leggedrobotics / SimBenchmark

Physics engine benchmark for robotics applications: RaiSim vs Bullet vs ODE vs MuJoCo vs DartSim
https://leggedrobotics.github.io/SimBenchmark/

Software errors with dartsim? #9

Open mxgrey opened 6 years ago

mxgrey commented 6 years ago

Thanks for creating this benchmark! It's an incredibly valuable service to the simulation community to be able to compare existing solutions in terms of performance and accuracy.

As a dartsim developer, I see these benchmark results as an opportunity to identify where dartsim could be improved.

I'm especially concerned about "cannot simulate due to inaccurate model or excepted" and "simulation fails due to software error" results.

To figure out what went wrong with those tests, I've cloned your repo and tested the benchmarks on my own machine (specifically DartRollingBenchmark, Dart666Benchmark, and DartElastic666Benchmark which were reported as impossible to run). I got the following results for these benchmarks:

[08:02:15:50:24 RollingBenchmark.cpp:179] 
CPU time   : 2.56495
mean error : 0.425893
speed (Hz) : 1559.48
[08:02:15:49:03 666Benchmark.cpp:228] 
CPU time   : 13.9889
mean error : 33302.2
[08:02:15:52:06 Elastic666Benchmark.cpp:195] 
CPU time   : 7.66768
mean error : 538642

None of these tests seemed to crash or error out (as far as I can tell), so would you be able to elaborate on what those two categories refer to? Or is there some other diagnostic information I should be looking at?

It's probably worth noting that I needed to run these benchmarks with the --nogui option, because running the GUI with any of the benchmarks results in a segfault inside of the rai_graphics::Shader_basic constructor (it seems a std::string constructor is hitting a logic error). If the GUI is providing important information, then I can't currently access that. If you happen to know what might be causing this segfault, I'd appreciate any tips on how to fix it.

mxgrey commented 6 years ago

Running the BtMb666Benchmark, I can see the stark difference in the error value of the result:

[08:02:16:08:18 666Benchmark.cpp:211] 
CPU time   : 10.8415
mean error : 0.38247

33302.2 for dartsim vs 0.38247 for Bullet.

So I'll assume that something very wrong is happening in the dartsim simulation which I would be able to clearly see if the GUI were usable. I'll try to use dartsim's GUI utilities to view the benchmark results unless you happen to know how I can get the rai GUI working.

jhwangbo commented 6 years ago

Hi Michael, I just pushed a commit that fixes your issue with the graphics. You have to add an environment variable that points to the graphics folder, which holds the shaders and a few images, with something like

echo 'export RAI_GRAPHICS_OPENGL_ROOT='$PWD/raiGraphics'' >> ~/.bashrc

Could you give us the exact Dart installation options that you used?

mxgrey commented 6 years ago

Thanks, that push helped, and now I can see the visualizations.

It seems that for 666 and Elastic666, the balls explode somewhat spectacularly, so presumably the contact handler is being way too aggressive. If we provided options to tune CFM and ERP like what was requested here it probably wouldn't be too hard to make those tests work decently.

But the Rolling benchmark seemed to work okay for me. Qualitatively, it looked similar to the results for the bullet version, although it does seem that the error is being computed as much higher. Are the rolling balls slipping too much? Or not slipping enough?

To clarify, when the report says "cannot simulate due to inaccurate model or excepted" or "simulation fails due to software error", does that mean that the computed error of the benchmark is so high that you've concluded it's invalid? Or does it mean that the application crashed in some way when you tried to run it?

> Could you give us the exact Dart installation options that you used?

I've been testing with the release-6.4 branch compiled from source on Ubuntu 16.04. My choice of release branch was arbitrary, so I'd be happy to try it with a different branch if there's a specific branch or release that you were using. Unfortunately, I can't test against the debian package, because the FCL package that our deb depends on is in conflict with another package (libfcl-0.5-dev) that I need for some other projects.

In case it's relevant, here are my computer specs:

jhwangbo commented 6 years ago

Elastic666 tests energy conservation, so turning on CFM and ERP would increase the error considerably. We also disable these features in RaiSim for this test. The balls are meant to explode somewhat, since the coefficient of restitution is 1. Here is a test shot from RaiSim:

https://leggedrobotics.github.io/SimBenchmark/elastic666/index.html

The rolling benchmark is designed such that nothing moves if the friction cone constraint is approximated by a pyramid. So it shouldn't work with an LCP solver (if it is a LINEAR complementarity problem solver). The default push direction was Y, but we test it with XY. I have updated the default to an XY force, so now you will see that nothing moves in Dart.
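To illustrate why the push direction matters, here is a minimal sketch comparing the exact circular friction cone with one common pyramid approximation (an axis-aligned, four-sided box bound on each tangential component). The values of `mu`, `normal`, and `push` are hypothetical, and the pyramid shown is an assumption for illustration, not necessarily the one Dart builds internally.

```python
import math

def holds_static_cone(fx, fy, mu, normal):
    """True if friction inside the exact circular cone can cancel the push."""
    return math.hypot(fx, fy) <= mu * normal

def holds_static_pyramid(fx, fy, mu, normal):
    """True if an axis-aligned four-sided pyramid can cancel the push.

    Each tangential component is bounded independently by mu * normal.
    """
    limit = mu * normal
    return abs(fx) <= limit and abs(fy) <= limit

# Hypothetical values, chosen so that mu*N < push < sqrt(2)*mu*N.
mu, normal = 0.8, 10.0
push = 1.2 * mu * normal

y_push = (0.0, push)                                   # force along Y only
xy_push = (push / math.sqrt(2), push / math.sqrt(2))   # same magnitude, diagonal

for name, (fx, fy) in (("Y", y_push), ("XY", xy_push)):
    print(name,
          "cone holds:", holds_static_cone(fx, fy, mu, normal),
          "pyramid holds:", holds_static_pyramid(fx, fy, mu, normal))
```

Under these assumptions the Y push exceeds both limits, so the object slides in either model. The diagonal push of the same magnitude still exceeds the circular cone but fits inside the pyramid, so a pyramid-based solver keeps the object at rest.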

All Dart errors were segmentation faults. But when I checked just now on a different machine, Dart worked fine. We will investigate this issue and give you an update.

We also test with release-6.4 on Ubuntu 16.04. The test machine has only 16 GB. This may be the issue, but we will sort it out.

eastskykang commented 6 years ago

@mxgrey Hi Michael, thank you so much for your valuable feedback.

There was an error in the benchmark result. At the beginning we tested with more ball-shaped objects, but we reduced the number because the simulation took too much time. I didn't update Dart's result for the smaller number of objects. Sorry for the misleading report. I am trying to reproduce the segmentation fault now and will let you know if there's an update.

Here are my results for the 666 test and the elastic 666 test with DART.


  1. When I ran the 666 test again, I got the following results:

    DANTZIG dt=0.001
    =======================
    CPU time   : 24.5724
    mean error : 0.000282356

    PGS dt=0.001
    =======================
    CPU time   : inf
    mean error : 37.3378

When I checked the system with the GUI, I found that the stack of balls explodes with the PGS solver. It shows similar results with different timesteps: dt=0.0001, dt=0.0004, dt=0.001, dt=0.004, dt=0.01, dt=0.04, dt=0.1. Also, please note that the Dantzig solver's error increases exponentially as the number of balls increases. When I tested with 10 x 10 x 10 balls (n=10), the result was

DANTZIG dt=0.001
=======================
CPU time   : 145.548
mean error : 1.30494e+11

With the Dantzig solver, the simulation blows up at n=10. (Video: dart)

This is not the case for other engines. For instance, Bullet's mean penetration errors are 0.38247 (n=6) and 5519.51 (n=10), and ODE with the Dantzig solver has 0.000154633 (n=6) and 0.000690886 (n=10).
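To put those numbers side by side, the following sketch computes how much the mean error grows from n=6 to n=10 for each engine, using only the values quoted in this comment. The dictionary keys are just labels chosen for this example.

```python
# Mean errors quoted in this thread: (n=6, n=10).
errors = {
    "DART (Dantzig)": (0.000282356, 1.30494e11),
    "Bullet": (0.38247, 5519.51),
    "ODE (Dantzig)": (0.000154633, 0.000690886),
}

# Growth factor of the mean error when going from 6x6x6 to 10x10x10 balls.
growth = {name: e10 / e6 for name, (e6, e10) in errors.items()}

for name, factor in growth.items():
    print(f"{name}: error grows by a factor of {factor:.3g}")
```

ODE's error grows by a single-digit factor and Bullet's by roughly four orders of magnitude, while the DART Dantzig result grows by more than fourteen orders of magnitude, which is consistent with the blow-up seen in the video.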


  2. When I ran the elastic 666 test, I got the following results:

    DANTZIG
    dt=0.0001 / error=2438.45
    dt=0.0004 / error=52414.7
    dt=0.001  / error=346909
    dt=0.004  / error=6.62224e+06
    dt=0.01   / error=1.28706e+10

    PGS
    dt=0.0001 / error=2407.45
    dt=0.0004 / error=52421.4
    dt=0.001  / error=346280
    dt=0.004  / error=6.60756e+06
    dt=0.01   / error=nan

  3. The Rolling Benchmark investigates the friction cone shape and the accuracy of the simulation with many slipping contacts. You will find that a force applied in the diagonal direction produces no motion. You can test this by passing the --force=xy flag.
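Going back to the elastic 666 errors listed above, one way to read them is to estimate how fast the error grows with the timestep. The sketch below computes the local log-log slope between consecutive Dantzig data points; the helper name `local_orders` is made up for this example, and the data are the values reported in this comment.

```python
import math

# Elastic 666 results with the Dantzig solver, as reported above.
dts = [0.0001, 0.0004, 0.001, 0.004, 0.01]
dantzig_errors = [2438.45, 52414.7, 346909, 6.62224e6, 1.28706e10]

def local_orders(steps, errs):
    """Slope of log(error) versus log(dt) between consecutive data points."""
    pairs = list(zip(steps, errs))
    return [math.log(e1 / e0) / math.log(h1 / h0)
            for (h0, e0), (h1, e1) in zip(pairs, pairs[1:])]

orders = local_orders(dts, dantzig_errors)
for (h0, h1), order in zip(zip(dts, dts[1:]), orders):
    print(f"dt {h0} -> {h1}: error ~ dt^{order:.2f}")
```

Up to dt=0.004 the slope stays close to 2, so the error grows roughly quadratically with the timestep. Between dt=0.004 and dt=0.01 the slope jumps to about 8, which matches the point where the PGS run returns nan.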

I will change the plot and the benchmark report of 666 and elastic666 test as soon as possible.

eastskykang commented 6 years ago

Please see https://leggedrobotics.github.io/SimBenchmark/666/index.html and https://leggedrobotics.github.io/SimBenchmark/elastic666/index.html for updated DART benchmark results from 666 and elastic 666 tests.