mimesis-inria / caribou

Multi-physics computation library
GNU Lesser General Public License v3.0
29 stars 17 forks source link

Run time difference beteween SOFA and FEniCS #115

Closed Ziemnono closed 2 years ago

Ziemnono commented 2 years ago

During the NR iterations, we observed slight differences between FEniCS and SOFA forcefield. For example with NeoHooke, linear tetrahedron we even observe that FEniCS is faster than SOFA:

Time step                  : 0
Context                    : /sofa_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 9.405307e+03  |R|/|R0| = 1.000000e+00  |du| = 2.811582e+01  |du| / |U| = 1.000000e+00  Time = 11 ms
Newton iteration #2      |R| = 4.779478e+01  |R|/|R0| = 5.081682e-03  |du| = 1.079542e+00  |du| / |U| = 3.846731e-02  Time = 5 ms
Newton iteration #3      |R| = 2.630266e-02  |R|/|R0| = 2.796577e-06  |du| = 2.790012e-02  |du| / |U| = 9.943424e-04  Time = 4 ms
Newton iteration #4      |R| = 2.394271e-07  |R|/|R0| = 2.545659e-11  |du| = 9.330179e-05  |du| / |U| = 3.325209e-06  Time = 4 ms
[CONVERGED] The residual's norm |R| = 2.39427e-07 is smaller than the threshold of 1e-05.

[INFO]    [StaticSolver] ======= Starting static ODE solver =======
Time step                  : 0
Context                    : /fenics_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 9.405308e+03  |R|/|R0| = 1.000000e+00  |du| = 2.811582e+01  |du| / |U| = 1.000000e+00  Time = 6 ms
Newton iteration #2      |R| = 4.779527e+01  |R|/|R0| = 5.081733e-03  |du| = 1.079542e+00  |du| / |U| = 3.846732e-02  Time = 2 ms
Newton iteration #3      |R| = 2.630541e-02  |R|/|R0| = 2.796869e-06  |du| = 2.790016e-02  |du| / |U| = 9.943439e-04  Time = 1 ms
Newton iteration #4      |R| = 2.381400e-07  |R|/|R0| = 2.531975e-11  |du| = 9.330410e-05  |du| / |U| = 3.325292e-06  Time = 1 ms
[CONVERGED] The residual's norm |R| = 2.3814e-07 is smaller than the threshold of 1e-05.

But for quadratic tetrahedron FEniCS is much much slower:

Time step                  : 0
Context                    : /sofa_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 1.687765e+05  |R|/|R0| = 1.000000e+00  |du| = 3.533437e+02  |du| / |U| = 1.000000e+00  Time = 64 ms
Newton iteration #2      |R| = 3.190437e+04  |R|/|R0| = 1.890333e-01  |du| = 8.710124e+01  |du| / |U| = 2.703041e-01  Time = 23 ms
Newton iteration #3      |R| = 1.854729e+03  |R|/|R0| = 1.098926e-02  |du| = 3.032463e+01  |du| / |U| = 1.017289e-01  Time = 16 ms
Newton iteration #4      |R| = 1.113557e+03  |R|/|R0| = 6.597823e-03  |du| = 1.337765e+01  |du| / |U| = 4.329553e-02  Time = 17 ms
Newton iteration #5      |R| = 6.940493e+01  |R|/|R0| = 4.112239e-04  |du| = 6.147766e+00  |du| / |U| = 1.951930e-02  Time = 17 ms
Newton iteration #6      |R| = 3.601234e+00  |R|/|R0| = 2.133729e-05  |du| = 1.262625e+00  |du| / |U| = 3.993332e-03  Time = 17 ms
Newton iteration #7      |R| = 1.631458e-03  |R|/|R0| = 9.666382e-09  |du| = 2.773893e-02  |du| / |U| = 8.772310e-05  Time = 16 ms
Newton iteration #8      |R| = 3.711575e-09  |R|/|R0| = 2.199106e-14  |du| = 3.061373e-05  |du| / |U| = 9.681453e-08  Time = 17 ms
[CONVERGED] The residual's norm |R| = 3.71157e-09 is smaller than the threshold of 1e-05.

[INFO]    [StaticSolver] ======= Starting static ODE solver =======
Time step                  : 0
Context                    : /fenics_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 1.687706e+05  |R|/|R0| = 1.000000e+00  |du| = 3.533437e+02  |du| / |U| = 1.000000e+00  Time = 271 ms
Newton iteration #2      |R| = 3.190224e+04  |R|/|R0| = 1.890272e-01  |du| = 8.710317e+01  |du| / |U| = 2.703159e-01  Time = 255 ms
Newton iteration #3      |R| = 1.856666e+03  |R|/|R0| = 1.100112e-02  |du| = 3.035003e+01  |du| / |U| = 1.018252e-01  Time = 252 ms
Newton iteration #4      |R| = 1.118024e+03  |R|/|R0| = 6.624515e-03  |du| = 1.333624e+01  |du| / |U| = 4.317372e-02  Time = 243 ms
Newton iteration #5      |R| = 6.948478e+01  |R|/|R0| = 4.117113e-04  |du| = 6.151740e+00  |du| / |U| = 1.953710e-02  Time = 263 ms
Newton iteration #6      |R| = 3.632954e+00  |R|/|R0| = 2.152599e-05  |du| = 1.267090e+00  |du| / |U| = 4.008459e-03  Time = 250 ms
Newton iteration #7      |R| = 1.646275e-03  |R|/|R0| = 9.754512e-09  |du| = 2.786253e-02  |du| / |U| = 8.813603e-05  Time = 248 ms
Newton iteration #8      |R| = 3.021262e-09  |R|/|R0| = 1.790158e-14  |du| = 3.099845e-05  |du| / |U| = 9.805572e-08  Time = 244 ms
[CONVERGED] The residual's norm |R| = 3.02126e-09 is smaller than the threshold of 1e-05.

What could cause this difference? @Ziemnono check the number of quadrature points used in FEniCS compared with SOFA.

jnbrunet commented 2 years ago

Arnaud did you find where the difference came from?

Ziemnono commented 2 years ago

Hi @jnbrunet, Yes indeed, as Jack pointed out, FEniCS is choosing automatically the quadrature degree. For quadratic tetrahedron, we used quadrature of degree 2 in Caribou, so I forced FEniCS to also use the same quadrature degree. This is the result of the same previous example. We see that FEniCS is 10 times faster than before and also slightly faster than Caribou :)

Time step                  : 0
Context                    : /sofa_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 1.687765e+05  |R|/|R0| = 1.000000e+00  |du| = 3.533437e+02  |du| / |U| = 1.000000e+00  Time = 60 ms
Newton iteration #2      |R| = 3.190437e+04  |R|/|R0| = 1.890333e-01  |du| = 8.710124e+01  |du| / |U| = 2.703041e-01  Time = 22 ms
Newton iteration #3      |R| = 1.854729e+03  |R|/|R0| = 1.098926e-02  |du| = 3.032463e+01  |du| / |U| = 1.017289e-01  Time = 18 ms
Newton iteration #4      |R| = 1.113557e+03  |R|/|R0| = 6.597823e-03  |du| = 1.337765e+01  |du| / |U| = 4.329553e-02  Time = 18 ms
Newton iteration #5      |R| = 6.940493e+01  |R|/|R0| = 4.112239e-04  |du| = 6.147766e+00  |du| / |U| = 1.951930e-02  Time = 16 ms
Newton iteration #6      |R| = 3.601234e+00  |R|/|R0| = 2.133729e-05  |du| = 1.262625e+00  |du| / |U| = 3.993332e-03  Time = 18 ms
Newton iteration #7      |R| = 1.631458e-03  |R|/|R0| = 9.666382e-09  |du| = 2.773893e-02  |du| / |U| = 8.772310e-05  Time = 19 ms
Newton iteration #8      |R| = 3.711575e-09  |R|/|R0| = 2.199106e-14  |du| = 3.061373e-05  |du| / |U| = 9.681453e-08  Time = 16 ms
[CONVERGED] The residual's norm |R| = 3.71157e-09 is smaller than the threshold of 1e-05.

[INFO]    [StaticSolver] ======= Starting static ODE solver =======
Time step                  : 0
Context                    : /fenics_node
Max number of iterations   : 25
Residual tolerance (abs)   : 1e-05
Residual tolerance (rel)   : 1e-10
Correction tolerance (abs) : 1e-05
Correction tolerance (rel) : 1e-15
Newton iteration #1      |R| = 1.687765e+05  |R|/|R0| = 1.000000e+00  |du| = 3.533437e+02  |du| / |U| = 1.000000e+00  Time = 40 ms
Newton iteration #2      |R| = 3.190439e+04  |R|/|R0| = 1.890333e-01  |du| = 8.710124e+01  |du| / |U| = 2.703041e-01  Time = 20 ms
Newton iteration #3      |R| = 1.854733e+03  |R|/|R0| = 1.098928e-02  |du| = 3.032469e+01  |du| / |U| = 1.017292e-01  Time = 18 ms
Newton iteration #4      |R| = 1.113562e+03  |R|/|R0| = 6.597851e-03  |du| = 1.337769e+01  |du| / |U| = 4.329566e-02  Time = 16 ms
Newton iteration #5      |R| = 6.940510e+01  |R|/|R0| = 4.112249e-04  |du| = 6.147781e+00  |du| / |U| = 1.951935e-02  Time = 15 ms
Newton iteration #6      |R| = 3.601271e+00  |R|/|R0| = 2.133751e-05  |du| = 1.262632e+00  |du| / |U| = 3.993356e-03  Time = 17 ms
Newton iteration #7      |R| = 1.631277e-03  |R|/|R0| = 9.665311e-09  |du| = 2.773915e-02  |du| / |U| = 8.772381e-05  Time = 15 ms
Newton iteration #8      |R| = 3.310869e-09  |R|/|R0| = 1.961688e-14  |du| = 3.061413e-05  |du| / |U| = 9.681580e-08  Time = 15 ms
[CONVERGED] The residual's norm |R| = 3.31087e-09 is smaller than the threshold of 1e-05.