JuliaLang / julia

The Julia Programming Language
https://julialang.org/
MIT License

Intermittent segfault in solving positive-definite sparse systems #2623

Closed dmbates closed 11 years ago

dmbates commented 11 years ago

I'm filing this mostly so that I have a record of the problem. I have experienced intermittent segfaults when solving positive-definite sparse systems using cholfact, which calls CHOLMOD functions through the code in base/linalg/cholmod.jl.

I haven't isolated the problem yet, but it seems to be related to using the LL' form of the Cholesky factorization, as opposed to the LDL' form. For symmetric sparse matrices the CHOLMOD code allows either form, with LDL' as the default. When I use the LL' form I sometimes get a segfault during subsequent solves. I added a test in `test/suitesparse.jl` that uses the LL' form, but the test doesn't segfault reliably for me.
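For readers on a current Julia release, the two forms can be compared with a rough sketch like the one below. It assumes a recent Julia 1.x, where `cholesky` and `ldlt` on a sparse symmetric matrix dispatch to CHOLMOD's LL' and LDL' factorizations respectively; this is not the `cholfact(A,true)` call from the 0.2-era wrapper discussed here. The small tridiagonal matrix and the right-hand side are hypothetical stand-ins, not the 48-by-48 matrix from the test file.

```julia
using LinearAlgebra, SparseArrays

# Hypothetical small symmetric positive-definite matrix (diagonally dominant
# tridiagonal); it stands in for the test matrix, which is not reproduced here.
n = 5
A = spdiagm(-1 => fill(-1.0, n - 1), 0 => fill(4.0, n), 1 => fill(-1.0, n - 1))
b = ones(n)

F_ll  = cholesky(A)   # LL' form
F_ldl = ldlt(A)       # LDL' form

# Solve the same system with both factorizations.
x_ll  = F_ll \ b
x_ldl = F_ldl \ b

# Both forms should agree with each other and satisfy the original system.
println(x_ll ≈ x_ldl)
println(A * x_ll ≈ b)
```

If one form misbehaved while the other did not, comparing the two solves in this way would be one quick check that narrows the problem to the factorization form.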

The segfault is usually preceded by a CHOLMOD error message about an invalid xtype (a field in the cholmod_* structs). For example, using the matrix A and vector B defined in test/suitesparse.jl:

julia> chma = cholfact(A,true)                 # LL' form

CHOLMOD factor:  :  48-by-48
  scalar types: int, real, double
  simplicial, LL'.
  ordering method used: AMD
         0:41
         1:25
         2:26
         3:27
         4:9
         5:39
         6:21
         7:44
    ...
        44:37
        45:38
        46:47
        47:7
  col: 0 colcount: 8
  col: 1 colcount: 5
  col: 2 colcount: 5
  col: 3 colcount: 7
  col: 4 colcount: 8
  col: 5 colcount: 8
  col: 6 colcount: 8
  col: 7 colcount: 10
    ...
  col: 44 colcount: 4
  col: 45 colcount: 3
  col: 46 colcount: 2
  col: 47 colcount: 1
monotonic: 1
 nzmax 489.
  col 0: nz 8 start 0 end 8 space 8 free 0:
         0: 46574
        34: 44.731
        37: 2147.1
        40: 149.1
        42: 17893
        43: 8.9463
        44: -149.1
        46: -53.678
    ...
  col 45: nz 3 start 483 end 486 space 3 free 0:
        45: 241.23
        46: 2095.4
        47: 11.905
  col 46: nz 2 start 486 end 488 space 2 free 0:
        46: 15647
        47: 14.614
  col 47: nz 1 start 488 end 489 space 1 free 0:
        47: 916.96
  nz 489  OK

julia> sparse(chma)
48x48 sparse matrix with 489 nonzeros:
    [1 ,  1]  =  46574.3
    [35,  1]  =  44.7314
    [38,  1]  =  2147.11
    [41,  1]  =  149.105
    [43,  1]  =  17892.6
    [44,  1]  =  8.94628
    [45,  1]  =  -149.105
    ⋮
    [47, 45]  =  -246.005
    [48, 45]  =  -1.8493
    [46, 46]  =  241.233
    [47, 46]  =  2095.4
    [48, 46]  =  11.9048
    [47, 47]  =  15647.2
    [48, 47]  =  14.6144
    [48, 48]  =  916.962

julia> isvalid(chma)
true

julia> chma\B
CHOLMOD error: invalid xtype

Process julia segmentation fault (core dumped) at Wed Mar 20 10:02:58 2013

This was using

julia> versioninfo()
Julia Version 0.2.0-659.r413f
Commit 413fbf9adb 2013-03-20 10:17:32
Platform Info:
  OS_NAME: Linux
Using: (64-bit interface)
  Blas: libopenblas
  Lapack: libopenblas
  Libm: libopenlibm

ViralBShah commented 11 years ago

@dmbates Is this still an issue? I believe that the latest iteration of your cholmod wrapper solved many such issues.

dmbates commented 11 years ago

@ViralBShah I haven't seen this behavior for a long time. Let's close the issue and cross our fingers :-)