jump-dev / CSDP.jl

A Julia interface to the Coin-OR solver CSDP
https://projects.coin-or.org/Csdp/

Sporadic malloc errors with CSDP 4.1 #39

Closed: mgreiff closed this issue 4 years ago

mgreiff commented 5 years ago

I should preface this by saying that I'm a novice when it comes to optimization in Julia and JuMP.

I ran into some problems when using CSDP to solve a set of LMIs related to H-infinity synthesis in robust control. To illustrate the problem, I have created a short script (see CSDP_issue_malloc.txt) which solves the same problem over and over again in a for-loop. It generates the correct solution repeatedly, until it eventually fails for one of two reasons: a malloc issue (see printout 1), or not being able to find a feasible solution (see printout 2).

Typically these failures occur after 50-100 correctly solved problems. I have solved the same problem using CVX, as well as the interior-point methods by Nesterov used in Matlab's LMI solvers, in both cases yielding a gain of gamma=1.45. So the CSDP solver seems to work nicely, apart from the two sporadically occurring errors. I have run the same experiments using SCS in Julia, and this works without malloc errors, so the problem seems to be related to CSDP.jl and not JuMP.jl.

The first error might be related to your issue #2, with the malloc leading to a segfault in my case, but I know too little about the internals of your CSDP wrapper to accurately debug it. I thought I'd alert you to this issue, and would be thankful for any ideas on what may be causing the problems.

printout 1

~~~ Run 71 ~~~
CSDP 6.2.0
Iter:  0 Ap: 0.00e+00 Pobj: -5.0377634e+02 Ad: 0.00e+00 Dobj:  0.0000000e+00
Iter:  1 Ap: 1.00e+00 Pobj: -5.9563600e+02 Ad: 7.35e-01 Dobj: -2.4754075e-01
Iter:  2 Ap: 1.00e+00 Pobj: -5.9529382e+02 Ad: 9.48e-01 Dobj:  2.6619412e-03
julia(35362,0x7fff75c76000) malloc: *** error for object 0x7fdbbad12008:
incorrect checksum for freed object - object was probably modified after being
  freed. *** set a breakpoint in malloc_error_break to debug

signal (6): Abort trap: 6
in expression starting at no file:0
__pthread_kill at /usr/lib/system/libsystem_kernel.dylib (unknown line)
Allocations: 98455385 (Pool: 98433939; Big: 21446); GC: 291
Abort trap: 6

printout 2

~~~ Run 49 ~~~
CSDP 6.2.0
Iter:  0 Ap: 0.00e+00 Pobj: -5.0377634e+02 Ad: 0.00e+00 Dobj:  0.0000000e+00
Iter:  1 Ap: 1.00e+00 Pobj: -5.9695547e+02 Ad: 7.35e-01 Dobj: -8.6584995e-01
.
.
.
Iter: 50 Ap: 7.86e-09 Pobj: -1.1346790e+18 Ad: 2.32e-08 Dobj: -8.7559876e+03
Stuck at edge of primal feasibility, giving up. 
Test Failed at REPL[10]:6
  Expression: abs(G - γ) <= 0.001
   Evaluated: 595.502673809033 <= 0.001
ERROR: There was an error during testing

blegat commented 5 years ago

Thanks for the detailed report. I have also noticed sporadic failures with CSDP (see for instance https://github.com/JuliaOpt/SumOfSquares.jl/issues/52 ) but haven't found the root cause yet. One idea could be to try something similar to https://github.com/JuliaOpt/Cbc.jl/pull/101 and see if it resolves the issue.

ericphanson commented 5 years ago

I am not sure if it is the same bug or not, but I also got a segfault using CSDP partway through an optimization with Pajarito, where CSDP is the continuous solver (using Gurobi as the MIP solver). It ran for a few minutes and I could see CSDP had solved many problems without incident during the course of the optimization, and then on one of the problems it immediately segfaulted. So it also seems to be a sporadic problem. The same overall optimization problem worked fine when I substituted Mosek for CSDP. I'll include the stacktrace below in case it's helpful:

signal (11): Segmentation fault: 11
in expression starting at no file:0
op_a at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
pinfeas at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/deps/usr/lib/libcsdp.dylib (unknown line)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.h.jl:294
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.jl:62
unknown function (ip: 0x131a5c0da)
sdp at /Users/eh540/.julia/packages/CSDP/7O621/src/declarations.jl:43
optimize! at /Users/eh540/.julia/packages/CSDP/7O621/src/MPB_wrapper.jl:73
optimize! at /Users/eh540/.julia/packages/SemidefiniteModels/31KiO/src/sd_to_conic.jl:313
unknown function (ip: 0x131a582cb)
solve_subp! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1829
solve_subp_add_subp_cuts! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1707
callback_lazy at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1452
lazycallback at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/callbacks.jl:78
#130 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/callbacks.jl:96
mastercallback at /Users/eh540/.julia/packages/Gurobi/dlJep/src/MPB_wrapper.jl:706
unknown function (ip: 0x131a61b4b)
PRIVATE00000000005cbaca at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE000000000047f5f0 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE0000000000483efe at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000003ef899 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE0000000000443909 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000003c86c5 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000005b9298 at /usr/local/lib/libgurobi81.dylib (unknown line)
PRIVATE00000000005b8dcf at /usr/local/lib/libgurobi81.dylib (unknown line)
GRBoptimize at /usr/local/lib/libgurobi81.dylib (unknown line)
optimize! at /Users/eh540/.julia/packages/Gurobi/dlJep/src/grb_solve.jl:5
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
#solve#120 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:175
#solve at ./none:0 [inlined]
solve_mip_driven! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:1493
optimize! at /Users/eh540/.julia/packages/Pajarito/PL3Lt/src/conic_algorithm.jl:659
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
#solve#120 at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:175
unknown function (ip: 0x125fe4ad9)
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
solve at /Users/eh540/.julia/packages/JuMP/PbnIJ/src/solvers.jl:150
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
do_call at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:323
eval_stmt_value at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:362 [inlined]
eval_body at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:759
jl_interpret_toplevel_thunk_callback at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:885
unknown function (ip: 0xfffffffffffffffe)
unknown function (ip: 0x11a51590f)
unknown function (ip: 0xffffffffffffffff)
jl_interpret_toplevel_thunk at /Users/osx/buildbot/slave/package_osx64/build/src/interpreter.c:894
jl_toplevel_eval_flex at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:764
jl_toplevel_eval at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:773 [inlined]
jl_toplevel_eval_in at /Users/osx/buildbot/slave/package_osx64/build/src/toplevel.c:793
eval at ./boot.jl:328
eval_user_input at /Users/osx/buildbot/slave/package_osx64/build/usr/share/julia/stdlib/v1.1/REPL/src/REPL.jl:85
run_backend at /Users/eh540/.julia/packages/Revise/SOSpn/src/Revise.jl:842
#68 at ./task.jl:259
jl_fptr_trampoline at /Users/osx/buildbot/slave/package_osx64/build/src/gf.c:1864
jl_apply at /Users/osx/buildbot/slave/package_osx64/build/src/./julia.h:1571 [inlined]
start_task at /Users/osx/buildbot/slave/package_osx64/build/src/task.c:572
Allocations: 109787744 (Pool: 109751500; Big: 36244); GC: 219
Segmentation fault: 11