Adding support for MUMPS sparse linear solver

zanellia commented 2 years ago

I am working on a package that relies on the Schur complement method in qpOASES and having to rely on MA57 makes it more cumbersome for people to install it. For this reason I have added support for the freely redistributable solver MUMPS.

In order to use MUMPS it define USE_SOLVER = MUMPS, e.g., in make_linux.mk. The Makefile will obtain the source code of MUMPS and compile and install it to <qpOASESroot>external/mumps_installation/lib automatically. In order to run qpOASES, <qpOASESroot>external/mumps_installation/lib needs to be present in LD_LIBRARY_PATH.

The tests are still not passing, however the solver takes exactly the same iterations until MUMPS fails to compute a factorization (at least for test_qrecipeSchur).

Also, I am not sure who does code reviews at the moment. Maybe @jferreau or @apotschka?

CLAassistant commented 2 years ago

All committers have signed the CLA.

zanellia commented 2 years ago

This is the output of ./qrecipeSchur using MA57 and MUMPS, respectively. I am printing the iterates and they are equivalent until iteration 91 and then the residuals of the iterative refinement takes very large values (~1e12). That happens around line 31501 (the files can be easily compared with, e.g., vimdiff).

ma57.log mumps.log

zanellia commented 2 years ago

In order to investigate the issue, I have added a script to generate the data for sparse QPs from Python. I have added to the repo the data for 2 QPs and example qrecipeSchur.cpp can be run with with different data commenting/uncommenting https://github.com/zanellia/qpOASES/blob/mumps/examples/qrecipeSchur.cpp#L40. Surprisingly, for the a trivial QP, the dense and sparse versions of qpOASES return different results:

Primal error (dense and sparse): 2.45e+02 while (with MUMPS) Primal error (dense and Schur): 2.71e-08

with MA57, I obtain a slightly better accuracy for the Schur complement method: Primal error (dense and sparse): 2.45e+02 (obviously) Primal error (dense and Schur): 4.55e-14

In conclusion, I would be tempted to say that the interface to MUMPS seems to work OK - MUMPS is just not very accurate in some cases. I was going to update the tests, but I am not sure how to proceed given the fact that the nullspace method returns a very different solution.

apotschka commented 2 years ago

In order to investigate the issue, I have added a script to generate the data for sparse QPs from Python. I have added to the repo the data for 2 QPs and example qrecipeSchur.cpp can be run with with different data commenting/uncommenting https://github.com/zanellia/qpOASES/blob/mumps/examples/qrecipeSchur.cpp#L40. Surprisingly, for the a trivial QP, the dense and sparse versions of qpOASES return different results:

Primal error (dense and sparse): 2.45e+02 while (with MUMPS) Primal error (dense and Schur): 2.71e-08

with MA57, I obtain a slightly better accuracy for the Schur complement method: Primal error (dense and sparse): 2.45e+02 (obviously) Primal error (dense and Schur): 4.55e-14

In conclusion, I would be tempted to say that the interface to MUMPS seems to work OK - MUMPS is just not very accurate in some cases. I was going to update the tests, but I am not sure how to proceed given the fact that the nullspace method returns a very different solution.

I did not have time yet to investigate this issue further.

zanellia commented 2 years ago

Huge thanks, Andrea. Your contribution is greatly appreciated. This really fills a wide-open gap!

I have made a couple comments that I would like to ask you to address before merging the pull request. Some of them on earlier commits you have already addressed yourself in later commits.

Thanks for the detailed code review @apotschka! I am addressing your comments today. On a high level:

Most importantly:

* We need to be able to change the C and Fortran integer data types via `__USE_LONG_INTEGERS__` and `__USE_LONG_FINTS__`.

I had troubles getting qpOASES running with some version of MA57 (especially running tests of my solver on Travis CI, the MATLAB version used on the server would affect the MA57 version which in turn caused erratic behavior with, e.g., segfaults). So the change is actually related to MA57 rather than MUMPS. I will try to get to a more precise conclusion regarding what breaks, but for now I can definitely remove it from this PR.

* Does it still compile on Windows and MacOS? If you cannot answer that yet, we might still complete the merge and open two tickets to check that.

Unfortunately, I do not have access to a macOS machine at the moment. Could this be an additional reason to setup , e.g., Travis CI for the repo? If that's useful, I can look into it.

apotschka commented 2 years ago

Unfortunately, I do not have access to a macOS machine at the moment. Could this be an additional reason to setup , e.g., Travis CI for the repo? If that's useful, I can look into it.

The addition of Travis CI would be really nice for future developments. If you have time to set it up, please go for it. If you need extended access rights here on GitHub, please let me know.

zanellia commented 2 years ago

@apotschka I have addressed most of your comments. In summary:

Could not test on Win/mac yet
I reverted the changes to the definitions of int_t and fint_t. I am still not sure why you'd need to pass long integers (64 bits) to MA57's API which takes Fortran integers (32 bit). This was actually causing segfaults with some versions of MA57. That's why I added that hack (and then forgot to clean it up).
The qrecipe_Schur example and associated test run fine after removing options.enableEqualities = BT_TRUE. Not sure why. If you or someone else has a possible explanation I would be interested :)
Unfortunately the smallSchur test is still failing.

zanellia commented 2 years ago

@apotschka @jferreau is there anything I can/should do in order to proceed?

apotschka commented 2 years ago

I reverted the changes to the definitions of int_t and fint_t. I am still not sure why you'd need to pass long integers (64 bits) to MA57's API which takes Fortran integers (32 bit). This was actually causing segfaults with some versions of MA57. That's why I added that hack (and then forgot to clean it up).

Matlab ships with an MA57 with long integers. If you want to run qpOASES from within Matlab with MA57, you need to take care of that.

The qrecipe_Schur example and associated test run fine after removing options.enableEqualities = BT_TRUE. Not sure why. If you or someone else has a possible explanation I would be interested :)

Unfortunately the smallSchur test is still failing.

Unfortunately, I do not have time right now to look into these issues. Number 3 sounds like problems with almost singular matrices and different behavior of different linear solvers in that case.

I'll merge the pull request now and hope potential bugs will be fixed later in a community effort.

coin-or / qpOASES

Adding support for MUMPS sparse linear solver #133