ami-iit / bipedal-locomotion-framework

Suite of libraries for achieving bipedal locomotion on humanoid robots
https://ami-iit.github.io/bipedal-locomotion-framework/
BSD 3-Clause "New" or "Revised" License

CentroidalMPC test fails with ma97_factor Matrix found to be singular #801

Closed GiulioRomualdi closed 6 months ago

GiulioRomualdi commented 8 months ago

I tried to run the CentroidalMPCTest on a computer with Ipopt version 3.11.9 and the HSL software installed from https://github.com/ami-iit/coinhsl-binary-packages/releases/tag/v2019.05.21.1. The solver prints this error:

Error return from ma97_factor. Error flag =  -7
Matrix found to be singular

Here is the full trace:

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************

This is Ipopt version 3.11.9, running with linear solver ma97.

Number of nonzeros in equality constraint Jacobian...:     1779
Number of nonzeros in inequality constraint Jacobian.:     1152
Number of nonzeros in Lagrangian Hessian.............:     2292

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular
Total number of variables............................:      555
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:      267
Total number of inequality constraints...............:      384
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:      384

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  0.0000000e+00 9.81e-01 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular
   1  2.0941018e+02 5.13e-03 6.08e+02  -1.7 1.51e+00  -2.0 1.65e-02 1.00e+00h  1

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular
   2  6.9412883e+01 4.53e-02 3.37e+02  -1.7 8.73e-01   1.1 1.64e-01 1.00e+00f  1
   3  3.1503730e+01 1.30e-02 2.75e+01  -1.7 6.97e-01   0.7 5.20e-01 1.00e+00f  1
   4  1.9367003e+01 1.57e-03 1.47e+01  -1.7 3.62e-01   0.2 8.24e-01 1.00e+00f  1
   5  1.7140886e+01 5.94e-04 3.43e-01  -1.7 1.41e-01  -0.3 1.00e+00 1.00e+00f  1
   6  1.6773502e+01 7.78e-05 3.99e-01  -1.7 1.40e-01  -0.8 1.00e+00 1.00e+00f  1
   7  1.6676290e+01 1.19e-05 9.64e-02  -1.7 7.25e-02  -1.3 1.00e+00 1.00e+00f  1
   8  1.6459613e+01 1.92e-05 1.17e-01  -2.5 7.58e-02  -1.7 1.00e+00 1.00e+00f  1
   9  1.6361429e+01 9.01e-06 8.66e-02  -2.5 5.98e-02  -2.2 1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  1.6351440e+01 3.02e-07 1.74e-03  -2.5 8.42e-03  -2.7 1.00e+00 1.00e+00h  1
  11  1.6314655e+01 3.15e-06 4.81e-02  -3.8 4.35e-02  -3.2 9.70e-01 1.00e+00f  1
  12  1.6306236e+01 1.30e-06 1.10e-02  -3.8 1.95e-02  -3.6 1.00e+00 1.00e+00f  1
  13  1.6305337e+01 2.20e-08 3.36e-04  -3.8 3.13e-03  -4.1 1.00e+00 1.00e+00h  1
  14  1.6304581e+01 1.15e-07 1.42e-03  -5.7 6.91e-03  -4.6 9.71e-01 1.00e+00f  1
  15  1.6304535e+01 1.32e-09 2.96e-05  -5.7 8.82e-04  -5.1 1.00e+00 1.00e+00h  1
  16  1.6304534e+01 5.12e-12 2.14e-07  -5.7 1.03e-04  -5.5 1.00e+00 1.00e+00h  1
  17  1.6304534e+01 2.76e-11 4.20e-07  -8.6 1.14e-04  -6.0 1.00e+00 1.00e+00h  1
  18  1.6304534e+01 6.99e-15 4.87e-11  -8.6 7.26e-07  -6.5 1.00e+00 1.00e+00h  1

Number of Iterations....: 18

                                   (scaled)                 (unscaled)
Objective...............:   1.6304533651274518e+01    1.6304533651274518e+01
Dual infeasibility......:   4.8689069272802319e-11    4.8689069272802319e-11
Constraint violation....:   6.9944050551384862e-15    6.9944050551384862e-15
Complementarity.........:   2.5127650985068144e-09    2.5127650985068144e-09
Overall NLP error.......:   2.5127650985068144e-09    2.5127650985068144e-09

Number of objective function evaluations             = 19
Number of objective gradient evaluations             = 19
Number of equality constraint evaluations            = 19
Number of inequality constraint evaluations          = 19
Number of equality constraint Jacobian evaluations   = 19
Number of inequality constraint Jacobian evaluations = 19
Number of Lagrangian Hessian evaluations             = 18
Total CPU secs in IPOPT (w/o function evaluations)   =      0.031
Total CPU secs in NLP function evaluations           =      0.002

EXIT: Optimal Solution Found.
      solver  :   t_proc      (avg)   t_wall      (avg)    n_eval
       nlp_f  | 121.00us (  6.37us) 119.32us (  6.28us)        19
       nlp_g  | 315.00us ( 16.58us) 313.63us ( 16.51us)        19
  nlp_grad_f  | 239.00us ( 11.95us) 249.37us ( 12.47us)        20
  nlp_hess_l  | 339.00us ( 18.83us) 342.29us ( 19.02us)        18
   nlp_jac_g  | 476.00us ( 23.80us) 475.07us ( 23.75us)        20
       total  |  33.47ms ( 33.47ms)  33.51ms ( 33.51ms)         1

@S-Dafarra @traversaro have you ever seen this kind of error?

GiulioRomualdi commented 8 months ago

I tried with ma27 and everything seems to work fine:

******************************************************************************
This program contains Ipopt, a library for large-scale nonlinear optimization.
 Ipopt is released as open source code under the Eclipse Public License (EPL).
         For more information visit http://projects.coin-or.org/Ipopt
******************************************************************************

This is Ipopt version 3.11.9, running with linear solver ma27.

Number of nonzeros in equality constraint Jacobian...:     1779
Number of nonzeros in inequality constraint Jacobian.:     1152
Number of nonzeros in Lagrangian Hessian.............:     2292

Total number of variables............................:      555
                     variables with only lower bounds:        0
                variables with lower and upper bounds:        0
                     variables with only upper bounds:        0
Total number of equality constraints.................:      267
Total number of inequality constraints...............:      384
        inequality constraints with only lower bounds:        0
   inequality constraints with lower and upper bounds:        0
        inequality constraints with only upper bounds:      384

iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
   0  0.0000000e+00 9.81e-01 1.00e+00  -1.0 0.00e+00    -  0.00e+00 0.00e+00   0
   1  2.0941018e+02 5.13e-03 6.08e+02  -1.7 1.51e+00  -2.0 1.65e-02 1.00e+00h  1
   2  6.9412883e+01 4.53e-02 3.37e+02  -1.7 8.73e-01   1.1 1.64e-01 1.00e+00f  1
   3  3.1503730e+01 1.30e-02 2.75e+01  -1.7 6.97e-01   0.7 5.20e-01 1.00e+00f  1
   4  1.9367003e+01 1.57e-03 1.47e+01  -1.7 3.62e-01   0.2 8.24e-01 1.00e+00f  1
   5  1.7140886e+01 5.94e-04 3.43e-01  -1.7 1.41e-01  -0.3 1.00e+00 1.00e+00f  1
   6  1.6773502e+01 7.78e-05 3.99e-01  -1.7 1.40e-01  -0.8 1.00e+00 1.00e+00f  1
   7  1.6676290e+01 1.19e-05 9.64e-02  -1.7 7.25e-02  -1.3 1.00e+00 1.00e+00f  1
   8  1.6459613e+01 1.92e-05 1.17e-01  -2.5 7.58e-02  -1.7 1.00e+00 1.00e+00f  1
   9  1.6361429e+01 9.01e-06 8.66e-02  -2.5 5.98e-02  -2.2 1.00e+00 1.00e+00f  1
iter    objective    inf_pr   inf_du lg(mu)  ||d||  lg(rg) alpha_du alpha_pr  ls
  10  1.6351440e+01 3.02e-07 1.74e-03  -2.5 8.42e-03  -2.7 1.00e+00 1.00e+00h  1
  11  1.6314655e+01 3.15e-06 4.81e-02  -3.8 4.35e-02  -3.2 9.70e-01 1.00e+00f  1
  12  1.6306236e+01 1.30e-06 1.10e-02  -3.8 1.95e-02  -3.6 1.00e+00 1.00e+00f  1
  13  1.6305337e+01 2.20e-08 3.36e-04  -3.8 3.13e-03  -4.1 1.00e+00 1.00e+00h  1
  14  1.6304581e+01 1.15e-07 1.42e-03  -5.7 6.91e-03  -4.6 9.71e-01 1.00e+00f  1
  15  1.6304535e+01 1.32e-09 2.96e-05  -5.7 8.82e-04  -5.1 1.00e+00 1.00e+00h  1
  16  1.6304534e+01 5.12e-12 2.14e-07  -5.7 1.03e-04  -5.5 1.00e+00 1.00e+00h  1
  17  1.6304534e+01 2.76e-11 4.20e-07  -8.6 1.14e-04  -6.0 1.00e+00 1.00e+00h  1
  18  1.6304534e+01 6.99e-15 4.87e-11  -8.6 7.26e-07  -6.5 1.00e+00 1.00e+00h  1

Number of Iterations....: 18

                                   (scaled)                 (unscaled)
Objective...............:   1.6304533651274522e+01    1.6304533651274522e+01
Dual infeasibility......:   4.8687741821402107e-11    4.8687741821402107e-11
Constraint violation....:   6.9944050551384862e-15    6.9944050551384862e-15
Complementarity.........:   2.5127650985070063e-09    2.5127650985070063e-09
Overall NLP error.......:   2.5127650985070063e-09    2.5127650985070063e-09

Number of objective function evaluations             = 19
Number of objective gradient evaluations             = 19
Number of equality constraint evaluations            = 19
Number of inequality constraint evaluations          = 19
Number of equality constraint Jacobian evaluations   = 19
Number of inequality constraint Jacobian evaluations = 19
Number of Lagrangian Hessian evaluations             = 18
Total CPU secs in IPOPT (w/o function evaluations)   =      0.027
Total CPU secs in NLP function evaluations           =      0.001

EXIT: Optimal Solution Found.
      solver  :   t_proc      (avg)   t_wall      (avg)    n_eval
       nlp_f  |  90.00us (  4.74us)  90.53us (  4.76us)        19
       nlp_g  | 253.00us ( 13.32us) 249.43us ( 13.13us)        19
  nlp_grad_f  | 183.00us (  9.15us) 183.75us (  9.19us)        20
  nlp_hess_l  | 286.00us ( 15.89us) 286.46us ( 15.91us)        18
   nlp_jac_g  | 405.00us ( 20.25us) 403.76us ( 20.19us)        20
       total  |  29.28ms ( 29.28ms)  29.28ms ( 29.28ms)         1
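
The two traces above differ mainly in the linear solver line and in the repeated factorization errors, while the final objective is essentially the same. A minimal sketch of how such logs could be compared automatically is given below. It assumes the log has the same layout as the traces in this thread; the function name `summarize_ipopt_log` and the shortened `sample` log are hypothetical and exist only for illustration.

```python
import re


def summarize_ipopt_log(log_text):
    """Extract a few comparable facts from an Ipopt console log."""
    # Banner line, e.g. "This is Ipopt version 3.11.9, running with linear solver ma97."
    banner = re.search(
        r"This is Ipopt version ([\d.]+), running with linear solver (\w+)\.",
        log_text,
    )
    # First number on the "Objective....:" summary line (the scaled value).
    objective = re.search(r"^Objective\.+:\s+(\S+)", log_text, re.MULTILINE)
    exit_line = re.search(r"^EXIT: (.+)$", log_text, re.MULTILINE)
    return {
        "ipopt_version": banner.group(1) if banner else None,
        "linear_solver": banner.group(2) if banner else None,
        # Counts lines such as "Error return from ma97_factor. Error flag =  -7".
        "factor_errors": len(re.findall(r"Error return from \w+_factor", log_text)),
        "objective": float(objective.group(1)) if objective else None,
        "exit": exit_line.group(1).strip() if exit_line else None,
    }


# Shortened, hand-made sample in the same layout as the ma97 trace above.
sample = """This is Ipopt version 3.11.9, running with linear solver ma97.

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

 Error return from ma97_factor. Error flag =  -7
 Matrix found to be singular

Objective...............:   1.6304533651274518e+01    1.6304533651274518e+01

EXIT: Optimal Solution Found.
"""

summary = summarize_ipopt_log(sample)
print(summary)
```

Running the same function on the ma97 and ma27 logs would make it easy to check that both runs exit with the same status and objective, and that only the ma97 run reports factorization errors.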

S-Dafarra commented 8 months ago

> @S-Dafarra @traversaro have you ever seen this kind of error?

No. You could try decreasing the pivoting threshold, ma97_u. Maybe some numbers are close to zero and are being treated as zeros.

Also, ma97 has a fairly complex scaling mechanism. You can try setting ma97_scaling to none.
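
One way to try these two options without touching the test code is an Ipopt options file, which lists one `name value` pair per line. The sketch below only renders such a file's text. The helper name `render_ipopt_options` is hypothetical, the value `1e-10` for `ma97_u` is purely illustrative (the thread does not suggest a number), and whether an `ipopt.opt` file is actually picked up depends on how the application initializes Ipopt, so treat this as an assumption to verify.

```python
def render_ipopt_options(options):
    """Render a mapping of Ipopt option names to values as options-file text."""
    lines = []
    for name, value in options.items():
        # One "name value" pair per line, separated by whitespace.
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


trial_options = {
    "linear_solver": "ma97",
    "ma97_u": 1e-10,         # illustrative value only, not taken from the thread
    "ma97_scaling": "none",  # disables ma97 scaling, as suggested above
}

options_text = render_ipopt_options(trial_options)
print(options_text)

# To use it, the text could be written to a file named ipopt.opt in the
# working directory of the test (assumption: the application lets Ipopt
# read that file):
# with open("ipopt.opt", "w") as handle:
#     handle.write(options_text)
```

If the solver is configured through CasADi, the same three names can instead be passed in the solver's Ipopt option dictionary; the file-based route is shown here only because it needs no extra dependency.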

GiulioRomualdi commented 8 months ago

I noticed that the PC I'm testing on has libblas-dev installed rather than libopenblas-dev, which is what I have on my PC, where I installed HSL from source following https://github.com/ami-iit/ami-commons/blob/master/doc/casadi-ipopt-hsl.md. However, I think that https://github.com/ami-iit/coinhsl-binary-packages/releases/tag/v2019.05.21.1 is built with libblas-dev, so I should probably try to recompile HSL with OpenBLAS.

GiulioRomualdi commented 8 months ago

I checked the BLAS alternatives on the PC I'm testing on:

update-alternatives --config libblas.so.3-x86_64-linux-gnu
There are 2 choices for the alternative libblas.so.3-x86_64-linux-gnu (providing /usr/lib/x86_64-linux-gnu/libblas.so.3).

  Selection    Path                                                     Priority   Status
------------------------------------------------------------
* 0            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       auto mode
  1            /usr/lib/x86_64-linux-gnu/blas/libblas.so.3               10        manual mode
  2            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       manual mode

That's the same configuration as on my PC.
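
To compare the BLAS selection across machines without reading the table by eye, the row marked with `*` (the current selection) can be extracted from the command output. This is a minimal sketch that assumes the column layout shown above; the function name `selected_alternative` is hypothetical.

```python
import re


def selected_alternative(config_output):
    """Return (path, status) of the row marked with '*', or None if absent."""
    for line in config_output.splitlines():
        # Columns: '*', selection index, path, priority, status.
        match = re.match(r"^\*\s+\d+\s+(\S+)\s+\d+\s+(.+?)\s*$", line)
        if match:
            return match.group(1), match.group(2)
    return None


# The three rows reported in this thread.
output = """\
* 0            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       auto mode
  1            /usr/lib/x86_64-linux-gnu/blas/libblas.so.3               10        manual mode
  2            /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3   100       manual mode
"""

path, status = selected_alternative(output)
print(path, status)
```

Here the selected path is the `openblas-pthread` one, so `libblas.so.3` resolves to OpenBLAS on this machine even though libblas-dev is installed.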

S-Dafarra commented 8 months ago

> I noticed that the PC I'm testing on has libblas-dev installed rather than libopenblas-dev, which is what I have on my PC, where I installed HSL from source following https://github.com/ami-iit/ami-commons/blob/master/doc/casadi-ipopt-hsl.md. However, I think that https://github.com/ami-iit/coinhsl-binary-packages/releases/tag/v2019.05.21.1 is built with libblas-dev, so I should probably try to recompile HSL with OpenBLAS.

This is a good point. What if you change the alternative to use libblas-dev also on the PC?

traversaro commented 8 months ago

I never saw this error. However, all BLAS implementations should be ABI-compatible.

traversaro commented 8 months ago

We are also using quite an old HSL version; we could try to update it.

GiulioRomualdi commented 8 months ago

~On my setup I'm using releases/2.2.1 for HSL~ On my setup I'm using 2019-05-21, the same version as on the setup where I'm trying to run the test. So the only thing that differs is IPOPT: on my setup it is compiled with coinbrew, while on the PC where I'm testing the controller it is installed with APT.

GiulioRomualdi commented 8 months ago

https://github.com/ami-iit/bipedal-locomotion-framework/issues/801#issuecomment-1909862173 updated

GiulioRomualdi commented 8 months ago

So to recap:

| library | PC where I test the controller (https://github.com/ami-iit/bipedal-locomotion-framework/issues/801#issue-2098580681) | My PC |
| --- | --- | --- |
| ipopt | 3.11.9 installed with APT | 3.14.12 installed with coinbrew |
| HSL | 2019-05-21 (installed with apt https://github.com/ami-iit/coinhsl-binary-packages/releases/tag/v2019.05.21.1) | 2019-05-21 (compiled with coinbrew) |

traversaro commented 8 months ago

So perhaps it is a problem with the ancient ipopt, if I understood correctly?

GiulioRomualdi commented 8 months ago

Probably yes.

GiulioRomualdi commented 6 months ago

Upon testing the code again on a PC with ipopt installed with apt but with recent HSL solvers, we concluded that this issue is related to an old version of ipopt.
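
Given that conclusion, a test setup could guard against old Ipopt builds by comparing version numbers and falling back to ma27, which worked on 3.11.9 in this thread. The sketch below is only an illustration: the helper names are hypothetical, and the thread does not establish which Ipopt release first avoids the problem, so the reference version is a parameter. The usage lines take 3.14.12, the version reported as working above, as the reference.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.11.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def is_older(version, reference):
    """True if `version` is strictly older than `reference`."""
    # Tuples compare component by component, so 3.9.0 < 3.14.12 holds here,
    # whereas a plain string comparison would get that case wrong.
    return parse_version(version) < parse_version(reference)


def pick_linear_solver(ipopt_version, known_good_version):
    """Fall back to ma27 when Ipopt is older than a version known to work with ma97."""
    if is_older(ipopt_version, known_good_version):
        return "ma27"
    return "ma97"


# Versions taken from the recap in this thread.
print(pick_linear_solver("3.11.9", "3.14.12"))
print(pick_linear_solver("3.14.12", "3.14.12"))
```

Comparing integer tuples rather than raw strings is the main design point, since dotted versions do not sort correctly as text.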