First benchmark Exponential vs Euler

andreadelprete commented 4 years ago

I've carried out the first rigorous test to compare Exponential and Euler. I describe the settings and show the results in this issue.

Description

robot: quadruped Solo
motion: standing on four feet, following sinusoidal trajectory on CoM vertical direction
controller: TSID with small gains (kp=2)
controller time step: 10 ms
total simulation time: 0.1 s

I computed the ground truth using either Exponential or Euler with a very small time step (i.e. about 10 us), and then I measured the error between the ground-truth state trajectories and the ones computed by Exponential and Euler using larger time steps. Time steps have been chosen starting from the controller time step (10 ms) and dividing it by powers of 2, starting from 1 up to 1024 (which gives the time step used to compute the ground truth). The power of two by which we divide the controller time step to get the simulation time step is called ndt, and it appears in the X axis of the plots below.

Since choosing one of the two methods to compute the ground truth does not seem fair, I repeated the test twice, one using Euler for the ground truth, one using Exponential for the ground truth. The results however do not change much, which makes sense because we've already seen that for small time steps the two approaches converge.

Results

These are the results using Exponential for the ground truth. consim_integration_error_using_exp_as_ground_truth

These are the results using Euler for the ground truth. consim_integration_error_using_euler_as_ground_truth

Comments

The plots show two errors: the final error is the error in the final state, whereas the total error (more interesting) is the error throughout the whole trajectory
The results in the two plots are very similar, except for the final point on the right, where of course the method that has been used for the ground truth has an advantage, but that's obvious and uninteresting.
There are no results of Euler for ndt smaller than 128 because it always went unstable, whereas Exponential remained stable up to ndt=1
For any given time step, Exponential has a smaller error than Euler

andreadelprete commented 4 years ago

After a few more tests I've seen that the right approach is to compute the ground truth using a time step that is at least 8 times smaller than the smallest time step you wanna test. By doing so, results are basically independent of which integrator you use. These are the "clean" results I got by using this approach: consim_integration_error_using_euler_as_ground_truth

andreadelprete commented 4 years ago

Here another interesting result. Same test as before, I just changed the contact damping from 300 to 100. consim_integration_error_with_lower_contact_damping_1e2 The error of Exponential is basically the same, the error of Euler went up by a factor of 5! It seems that Exponential can handle oscillatory contacts much better than Euler!

andreadelprete commented 4 years ago

I've decided to do some tests to see how the integration accuracy degrades as we use a lower accuracy in the computation of the matrix exponential. The idea is that since computing the matrix exponential is the most expansive operation, if we don't need to compute it accurately to get good results then we can reduce its computation cost.

Computing the matrix exponential boils down to one matrix inversion (which we cannot avoid) and a bunch of matrix-matrix multiplications. It turns out that the matrix-matrix multiplications are rather expansive and take most of the time. So I've written a class that allows the user to compute the matrix exponential but with an upper bound on the number of matrix-matrix multiplications that are used. This of course could affect the accuracy of the results. We wanna see how these less accurate matrix exponentials affect the overall accuracy of our Exponential simulator.

First of all, let's see how many matrix-matrix multiplications are needed on average by our computations. Since the number of matrix-matrix multiplications depend on the norm of the matrix of which we wanna compute the exponential, in our simulator this number depends on the time step. Using a smaller time step we need less matrix multiplications, which is shown by the green line in this plot: mat_mult_expm_vs_ndt The other lines show the average number of matrix-matrix multiplications taken when we set a bound on the max number of matrix-matrix multiplications, and we can see that they basically always saturate at the bound.

Let's now see the integrator errors: integration_error_vs_ndt for different max_mat_mult_expm For ndt>=4, which corresponds to a time step <= 2.5 ms, we can see no difference when using inaccurate matrix exponentials, which is really good news because it means we can compute expm with just 2 matrix multiplications (rather than 9!), saving a lot of computation time! For ndt=2 (i.e. dt=5 ms) we see a significant degradation for the case of 2 matrix-multiplications. For `ndt=1 (i.e. dt=10 ms) we see something weird: using less matrix multiplications we actually can get better results! This is actually a known problem of algorithms computing expm, and it's called over-scaling. So in this case, we could use just 6 matrix multiplications (instead of 11) and even get a better result.

This result should be of particular interest for @olimexsmart , and I think these results suggest that we should add to the interface of expokit-cpp the option for the user to specify the maximum number of matrix multiplications to be used in the computation of the matrix exponential, because clearly we don't need to do as many as the theory says we should!

olimexsmart commented 4 years ago

Well this is nice! I see no problem in adding that parameter. At this point would make sense to look if floats could be used instead of doubles?

andreadelprete commented 4 years ago

That's for sure an alternative way to save some computation time. However, I don't think I can test that in Python, and considering we don't have much time left before the deadline (3 weeks) let's first implement this, and then if there is time we can also test using floats. You can take a look here to see how I implemented it. A few comments:

The number of matrix multiplication includes both the ones for squaring and the ones for the Pade approximation.
I was not using balancing in the test above, so there's room for improvement!
In Python I am using the new 2009 version of the algorithm by Higham, which is less prone to overscaling than the one we have in c++. This means that in C++ the gains could be even higher!
We should find a way to compute automatically the appropriate number of matrix multiplications from the matrix norm because these results show that we need less than the theory says, but it does not tell us exactly how many

andreadelprete commented 4 years ago

Since we got rather encouraging results by reducing the number of matrix multiplications allowed in expm, I've tried to push the test a bit further, testing also the cases of 1 and 0 matrix multiplications (in the previous test I had stopped at 2). These are the results: accuracy_bounded_mat_mult_expm mmm_bounded_mat_mult_expm

We can see that:

with dt<=0.5 ms using 0 mat. mult. is enough (instead of 6 or less, depending on dt)
with dt = 1 ms using 1 mat. mult. is enough (instead of 7)
with dt = 2 ms using 2 mat. mult. is enough (instead of 8)
with dt = 5 ms using 3 mat. mult. is enough (instead of 9)
with dt = 10 ms using 6 mat. mult. is enough (instead of 10)

How this affects the computation times will need to be investigated using the C++ code. I expect that these reductions in the # of mat. mult. will drastically reduce the computation time of expm, because the computation time should be roughly proportional to (1 + #mat-mult), where the 1 accounts for the matrix inversion. For instance, taking the case of dt=2, if the time to compute expm before was 100 us, reducing the number of mat. mult. from 8 to 2 the computation time should become 100*(1+2)/(1+8) = 33 us.

andreadelprete commented 4 years ago

I'm still playing with different variations of the same test described above. Today I've tried to see how the integration error varies with time and I got these results: integration_error_vs_time_zero_initial_force Here we can really see that the benefit of Exponential comes from the initial phase of the motion, where the robot starts with zero contact forces and so the contact forces change rapidly in the beginning. Euler is very inaccurate in this phase, but once this phase is over the errors of the two simulation schemes converge to the same value, which depend only on the time step (not on the scheme). This suggests that there are phases in which using Euler would be fine, therefore saving computation time, and phases in which instead we need something more, such as the Exponential integrator. Identifying those phases however does not seem trivial to me. I'm gonna investigate more.

andreadelprete commented 4 years ago

Rather than switching between Exponential and Euler, we could have a more granular control of the accuracy of the integration scheme by always using Exponential, but varying the number of matr. mult. used for expm. Here an example: integration_error_vs_time_with_mmm We can see that in the very beginning we need at least 3 mat. mult. to get accurate results, but then at the second time step (10 ms) we can already switch to 2 mat. mult. without any loss of accuracy; after 100 ms we can switch to 1 mat. mult., and finally after 150 ms we can even switch to 0 mat. mult. The challenge remains to figure out when to increase/decrease the number of mat. mult.

PS: I think the fact that between 10 and 60 ms the method using 1 mat. mult. performed better than the others is just a lucky case and we shouldn't expect to see this in general.

andreadelprete commented 4 years ago

I've finally succeeded reproducing the first results above (which were computed with the Python simulator) with the C++ simulator. Here they are: solo_squat_cpp_exp_gt_T_0 1 The exponential simulator behaves very similarly for small time steps, and a bit better for large time steps, which is good news. The Euler simulator seems to work a bit better in C++. Not clear why, but I think it doesn't matter much right now. We can move forward with the benchmarks.

andreadelprete / consim