coin-or / Ipopt

COIN-OR Interior Point Optimizer IPOPT
https://coin-or.github.io/Ipopt

Please provide support for newer version of MUMPS like 5.1.2 #310

Closed svigerske closed 5 years ago

svigerske commented 5 years ago

Issue created by migration from Trac.

Original creator: JasonZhou

Original creation time: 2018-10-24 23:21:31

Assignee: ipopt-team

Version: 3.12

JasonZhou404 commented 5 years ago

Could the maintainers somehow support the compile option for the parallel version of MUMPS?

svigerske commented 5 years ago

Ipopt 3.13.0, and actually also 3.12.13 and possibly a few earlier releases, can be used with MUMPS 5.

The ThirdParty-Mumps project has a branch mumps5 to build a Mumps 5.2.1 library that could easily be picked up by the Ipopt 3.13 buildsystem.

However, MUMPS 4.10.0 is currently still the default since, at least on CUTEst, it seems to offer better performance than 5.2.1 on average. Maybe the CUTEst problems are just too small, or some MUMPS flags should be changed. This needs further investigation.

dpo commented 4 years ago

Hi all. Is there any news on this issue? MUMPS 5.3.0 came out today, and it would be good to know whether IPOPT recommends using version 5. A fair number of improvements and bug fixes have been made since version 4: http://mumps.enseeiht.fr/index.php?page=dwnld#cl

Thanks!

svigerske commented 4 years ago

I've now built Ipopt master once with MUMPS 4.10.0 and once with MUMPS 5.3.1, both via the buildsystem of https://github.com/coin-or-tools/ThirdParty-Mumps/ (branch master for MUMPS 4.10.0 and branch mumps5 for MUMPS 5.3.1). Both MUMPS versions are built with Metis 5.1.0, and OpenMP has been disabled for MUMPS 5. For linear algebra, I use serial Intel MKL, so MUMPS 5 is built with -DGEMMT_AVAILABLE.

I've run CUTEst ($MASTSIF/*.SIF) with a time limit of 900s and an iteration limit of 100000000. Here are two CSV files with the results per instance: mumps4.txt, mumps5.txt. TerminationStatus Normal corresponds to exit code 0 in the log.

On 1081 instances, Ipopt terminated normally with both MUMPS 4 and MUMPS 5. Here is a scatter plot of the total time spent by Ipopt: totaltime_scatter

Of course, there can be side effects due to time spent in parts other than MUMPS, different results from the linear solver that lead to a different path, etc. Here is a scatter plot of the time spent in the linear solver, divided by the number of iterations: linsolvertime_scatter

With more effort, one could better compare the time spent in MUMPS only, e.g., by using exactly the same input matrices. However, for an Ipopt user, the total time spent in Ipopt is probably more interesting, so I would argue that the first plot is the relevant one. In both plots, I still see a tendency towards MUMPS 4. Here is the Python script that generated these plots from the input files: eval.txt
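For readers without access to eval.txt, the pairing step described above (keep only instances where both runs ended with TerminationStatus Normal, then compare times) might look roughly like the sketch below. This is not the original script. The column names `Instance` and `TotalTime`, the other status strings, and the sample rows are assumptions made up for illustration; only `TerminationStatus` and the value `Normal` come from the comment above.

```python
import csv
import io


def load_results(text):
    """Map instance name -> row dict for one per-instance results file."""
    return {row["Instance"]: row for row in csv.DictReader(io.StringIO(text))}


def paired_times(res_a, res_b, column="TotalTime"):
    """Return (instance, time_a, time_b) for instances solved normally by both runs."""
    pairs = []
    for name in sorted(res_a.keys() & res_b.keys()):
        a, b = res_a[name], res_b[name]
        if a["TerminationStatus"] == "Normal" and b["TerminationStatus"] == "Normal":
            pairs.append((name, float(a[column]), float(b[column])))
    return pairs


# Made-up placeholder rows (NOT real CUTEst results), only to exercise the code.
MUMPS4_SAMPLE = """Instance,TerminationStatus,TotalTime
EXAMPLE_A,Normal,1.0
EXAMPLE_B,Normal,2.0
EXAMPLE_C,TimeLimit,900.0
"""
MUMPS5_SAMPLE = """Instance,TerminationStatus,TotalTime
EXAMPLE_A,Normal,1.5
EXAMPLE_B,IterationLimit,3.0
EXAMPLE_C,Normal,10.0
"""

pairs = paired_times(load_results(MUMPS4_SAMPLE), load_results(MUMPS5_SAMPLE))
for name, t4, t5 in pairs:
    print(f"{name}: mumps4={t4:.2f}s mumps5={t5:.2f}s")
```

The resulting pairs are what one would feed into a scatter plot. Passing a different `column` (for example a hypothetical per-iteration linear solver time) would give the data for the second plot.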

The preprint https://arxiv.org/pdf/1909.08104.pdf mentions that CUTEst instances may often be too small. I tried taking the number of nonzeros in the Hessian of the Lagrangian into account, but that didn't really change the picture. So, from this comparison, I can't clearly recommend using MUMPS 5.3.1 over MUMPS 4.10.0 to achieve better single-thread performance. On other testsets with larger instances, or when using more cores, this could look different. Stability and "build-ability" could also play an increasing role.

If someone has an idea how to improve the MUMPS interface in Ipopt to work better with MUMPS 5, I'm all ears.

damienhocking commented 4 years ago

I'm the original developer of the MUMPS interface for IPOPT. Yes, I'm that old. What else would you need the interface to do? We've been using MUMPS 5 in IPOPT since 5.0 came out. For small, single-threaded problems there haven't been any major improvements between 4.x and 5.x; most of the performance there comes from the matrix ordering you choose and the BLAS library. On larger problems (1e6+ variables), MUMPS 5.x offers significant speedups over 4.x because of its OpenMP parallelism. We also found that Intel MKL Pardiso is the absolute fastest linear solver, but that's on our problems, which don't reflect the breadth of the CUTEst set.

svigerske commented 4 years ago

I wonder what is causing this frequent slowdown when only switching from single-threaded MUMPS 4 to single-threaded MUMPS 5 on this testset. Is it indeed that MUMPS became slower on this end because the MUMPS development focused solely on large problems and multi-thread usage? Or is there some option of MUMPS that one could set in the Ipopt/MUMPS interface to get at least similar performance again?

It is just difficult to suggest switching from MUMPS 4 to MUMPS 5 when Ipopt's single-thread performance will suffer on a testset that is as widely accepted as CUTEst.
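As a starting point for such experiments, MUMPS-related settings can be varied through an Ipopt options file without rebuilding. The sketch below only renders such a file. The option names `linear_solver`, `mumps_pivot_order`, `mumps_scaling`, and `mumps_mem_percent` are Ipopt options as far as I know, but check them against the documentation of your Ipopt version. The values are arbitrary candidates to try, not a recommendation, and nothing in this thread establishes that any of them closes the gap. The helper `render_ipopt_options` is hypothetical.

```python
def render_ipopt_options(options):
    """Render a dict as Ipopt options file text: one 'name value' pair per line."""
    lines = ["# Candidate MUMPS settings to experiment with (values are assumptions)"]
    for name, value in options.items():
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


# Arbitrary example values; which setting (if any) helps MUMPS 5 is an open question here.
candidate = {
    "linear_solver": "mumps",
    "mumps_pivot_order": 5,    # passed to MUMPS as the ordering choice (ICNTL(7))
    "mumps_scaling": 77,       # passed to MUMPS as the scaling strategy (ICNTL(8))
    "mumps_mem_percent": 1000, # extra working memory for MUMPS, in percent
}

text = render_ipopt_options(candidate)
print(text)
```

Written to a file named `ipopt.opt` in the working directory, this text would normally be picked up by Ipopt at startup, so the same binary can be rerun on a testset with different settings.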

damienhocking commented 4 years ago

The best answer for users might be to support both: MUMPS 5 for large-scale and MUMPS 4 for small-scale problems.

svigerske commented 4 years ago

Well, we support both for now. But one has to choose at build time, and one needs to read the documentation a bit carefully to see how to switch to MUMPS 5.

On the CUTEst run I posted above, there are only 7 instances that were solved by both within the timelimit and that had more than 1 million nonzeros in the Hessian of the Lagrangian. It doesn't seem as if MUMPS5 was giving an advantage there:

instance   mumps4 time   mumps5 time       nnz
CHARDIS0          2.72          2.64   1001000
WALL100          12.31         13.33   1446475
BA-L16LS        181.50        193.58   2393742
EIGENACO         15.86         16.45   3252525
EIGENALS        272.09        269.32   3252525
EIGENCCO        162.29        156.27   3517878
ODNAMUR         500.70        531.87   9385795

So this is still too small then.
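To put a number on the seven rows above, one can compute the per-instance ratio of MUMPS 5 time to MUMPS 4 time and its geometric mean. The sketch below uses only the values from the table. The geometric mean as the summary statistic is my choice for illustration, not something used earlier in the thread.

```python
import math

# (instance, mumps4 time, mumps5 time, nnz), copied from the table above.
ROWS = [
    ("CHARDIS0", 2.72, 2.64, 1001000),
    ("WALL100", 12.31, 13.33, 1446475),
    ("BA-L16LS", 181.50, 193.58, 2393742),
    ("EIGENACO", 15.86, 16.45, 3252525),
    ("EIGENALS", 272.09, 269.32, 3252525),
    ("EIGENCCO", 162.29, 156.27, 3517878),
    ("ODNAMUR", 500.70, 531.87, 9385795),
]


def geometric_mean(values):
    """Geometric mean of positive numbers, computed via logarithms."""
    values = list(values)
    return math.exp(sum(math.log(v) for v in values) / len(values))


# A ratio below 1 means MUMPS 5 was faster on that instance.
ratios = {name: t5 / t4 for name, t4, t5, _nnz in ROWS}
mumps5_faster = sorted(name for name, r in ratios.items() if r < 1.0)
geo_mean_ratio = geometric_mean(ratios.values())

for name, r in ratios.items():
    print(f"{name:10s} mumps5/mumps4 = {r:.3f}")
print("MUMPS 5 faster on:", ", ".join(mumps5_faster))
print(f"geometric mean ratio = {geo_mean_ratio:.3f}")
```

A geometric mean slightly above 1 would be consistent with the reading above that MUMPS 5 gives no clear advantage on these instances.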