stack-of-tasks / pinocchio

A fast and flexible implementation of Rigid Body Dynamics algorithms and their analytical derivatives
http://stack-of-tasks.github.io/pinocchio/
BSD 2-Clause "Simplified" License
1.78k stars 375 forks source link

What is the typical computation time of CRBA? #1460

Closed rshum19 closed 3 years ago

rshum19 commented 3 years ago

Hi,

This is a question regarding the computing speed of Pinocchio. I'm developing a Operational Space Controller. However, I'm having issues in

According to the Pinocchio the technical paper. It claims that the dynamics of of different system can me computed in microseconds. For the computation of RNEA, CRBA and ABA it is computed < 10 microseconds. I'm not getting this same performance. Thus

The following piece of code takes the 1898 microseconds to compute the mass matrix of the Atlas robot (the urdf was taken from the rbd_benchmark repo)

 using namespace std::chrono;

 // Import URDF model
  Model model;
  pinocchio::urdf::buildModel(urdf_filename, model);
  Data data(model);

  // State
  Eigen::VectorXd qs(model.nq);
  qs.setZero();

  // CRBA
  auto start_crba = high_resolution_clock::now();
  crba(model, data, qs);
  auto stop_crba = high_resolution_clock::now();

  auto duration_crba = duration_cast<microseconds>(stop_crba - start_crba);
  std::cout << "CRBA = \t\t" << duration_crba.count() << std::endl;

My machine has the following specs which seem to be similar to the ones in the technical paper.

Is there any settings/configuration or compiling option that I'm might be missing the is preventing me from matching the reported computation speed? Or is the value I'm getting what is expected?

Thank you!

proyan commented 3 years ago

Hi Robert,

There are many reasons for your computer to not provide the benchmarks, the biggest one is that you are not making sure that the benchmark is being prioritized over other applications. If the computer is sharing resources, it won't provide you with a consistent performance.

In order to make sure that you are actually checking correctly the benchmark performance: 1) run htop to make sure no other applications are hogging resources (zoom, email clients, daemons, browsers, heavy desktop environments, heavy themes etc) 2) disable "Turbo-boost" (https://askubuntu.com/questions/619875/disabling-intel-turbo-boost-in-ubuntu) which is an Intel way of overcharging resources for specific applications. This introduces high variability in  the computations. 3) Take multiple samples, and take the mean over those samples. (see our benchmark folder for inspiration) 4) or, download pinocchio from source, build the benchmarks, and after ensuring 1,and 2, run to verify that you are getting similar performance. If not, there are issues in your benchmarking.

Best, Rohan On 6/14/21 10:47 PM, Roberto Shu wrote:

Hi,

This is a question regarding the computing speed of Pinocchio. I'm developing a Operational Space Controller. However, I'm having issues in

According to the Pinocchio the technical paper https://hal-laas.archives-ouvertes.fr/hal-01866228. It claims that the dynamics of of different system can me computed in microseconds. For the computation of /RNEA, CRBA/ and /ABA/ it is computed < 10 microseconds. I'm not getting this same performance. Thus

The following piece of code takes the 1898 microseconds to compute the mass matrix of the Atlas robot (the urdf was taken from the rbd_benchmark repo https://github.com/rbd-benchmarks/rbd-benchmarks/tree/master/description/urdf)

|using namespace std::chrono; // Import URDF model Model model; pinocchio::urdf::buildModel(urdf_filename, model); Data data(model); // State Eigen::VectorXd qs(model.nq); qs.setZero(); // CRBA auto start_crba = high_resolution_clock::now(); crba(model, data, qs); auto stop_crba = high_resolution_clock::now(); auto duration_crba = duration_cast(stop_crba - start_crba); std::cout << "CRBA = \t\t" << duration_crba.count() << std::endl; |

My machine has the following specs which seem to be similar to the ones in the technical paper.

  • Memory 16 Gb
  • Processor Intel Core i7-9700K CPU @ 3.60Ghz
  • GPU: GeForce RTX 2080

Is there any settings/configuration or compiling option that I'm might be missing the is preventing me from matching the reported computation speed? Or is the value I'm getting what is expected?

Thank you!

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/stack-of-tasks/pinocchio/issues/1460, or unsubscribe https://github.com/notifications/unsubscribe-auth/AC6ZDNQSDNB7A4HXVI4JJDTTSZTENANCNFSM46V6FVOQ.

jcarpent commented 3 years ago

I do think you did not compile your file using O3 option, or Release mode with CMake.

rshum19 commented 3 years ago

Thank you for the prompt response as always! Yes, compiling with the Release flag in CMake reduced the timing! It now takes 4 microseconds. Thank you @jcarpent and @proyan