Matrix instance where UMFPACK 64 routines fails to factorize

WalterLu3 commented 7 months ago

Describe the bug I encountered an instance of matrix where UMFPACK 32-bit routine is unable to factorize it because of out-of-memory error. I then tried to use the UMFPACK 64-bit routine but it get stuck at the _numeric stage. I am able to factorize this matrix using other basis routines such as the ones provided by HiGHS. Not sure if it’s a bug, but I hope reporting this is helpful for you.

To Reproduce The following are the steps I do to reproduce the error. The binary file that contains the matrix is failed_matrix.dat. All the files mentioned here can be found in: https://github.com/WalterLu3/umfpack_edge_case

Step 1. Reproducing Out-Of-Memory error with UMFPACK 32-bit interface.

I compiled this by dynamically linking to libumfpack.dylib.

#include <stdio.h> //perror
#include "umfpack.h"
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

/*********************************************************************************
 * failed_matrix.dat is a binary that stores our instances.
 *  
 * failed_matrix.dat contains the matrix information in the 
 * following order :
 * 
 * 1. int size    (the size of the square matrix. i.e the number of rows and columns)
 * 2. int nnz     (number of non-zero element in our data matrix)
 * 3. int *Ai     (An array of length 'nnz' to store the row indices
 * 4. int *Ap     (An array of length 'size + 1')
 * 5. double *Ax  (An array of length 'nnz' to store matrix data)
 * 
 * 
 * So it stores everything based on UMFPACK's form.
 * *******************************************************************************/

/*********************************************************************************
 * This file does the following steps 
 * 1. Read in the matrix that fails to be factorized.
 * 2. Dynamically load the routines for factorization. (you need to input your
 *    UMFPACK_PATH to link to the dynamic libraray)
 * 3. Try to factorize it using int32 interface.
 * 4. Failed with Out-Of-Memory Error
 ********************************************************************************/

int main(int argc, char **argv){

    FILE *fptr;
    int status;
    int size;
    int nnz;
    int *Ai;
    int *Ap;
    double *Ax;
    double *null = (double *) NULL ;
    void *Symbolic, *Numeric ;

    /* read in matrix */

    /* open file*/
    fptr = fopen("failed_matrix.dat","r");
    if (fptr == NULL){
        perror("file open failed");
    }

    /* read in all matrix information */
    fread(&size, sizeof(int), 1, fptr);
    fread(&nnz, sizeof(int), 1, fptr);

    Ai = (int *)malloc(nnz*sizeof(int));
    Ap = (int *)malloc((size + 1)*sizeof(int));
    Ax = (double *)malloc(nnz*sizeof(double));

    fread(Ai, sizeof(int), nnz, fptr);
    fread(Ap, sizeof(int), size + 1, fptr);
    fread(Ax, sizeof(double), nnz, fptr);
    /* finish reading matrix*/

    /* start factorization step*/

    printf("run symbolic step\n");
    status = umfpack_di_symbolic(size, size, Ap, Ai, Ax, &Symbolic, null, null);
    if (status != 0){
        perror("symbolic");
    }

    printf("run numeric step\n");
    status = umfpack_di_numeric(Ap, Ai, Ax, Symbolic, &Numeric, null, null);
    if (status != 0){
        printf("status code : %d\n", status);
    }

    free(Ai);
    free(Ap);
    free(Ax);
    return 0;
}

It gives me an out-of-memory error. According to the UMFPACK user guide, I proceeded to use UMPACK 64 routines.

Step 2. Reproducing error with UMFPACK 64-bit never ending. I also compiled the following by dynamically linking to libumfpack.dylib.

#include <stdio.h> //perror
#include "umfpack.h"
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

#define Int64 long long

/*********************************************************************************
 * failed_matrix.dat is a binary that stores our instances.
 *  
 * failed_matrix.dat contains the matrix information in the 
 * following order :
 * 
 * 1. int size    (the size of the square matrix. i.e the number of rows and columns)
 * 2. int nnz     (number of non-zero element in our data matrix)
 * 3. int *Ai     (An array of length 'nnz' to store the row indices
 * 4. int *Ap     (An array of length 'size + 1')
 * 5. double *Ax  (An array of length 'nnz' to store matrix data)
 * 
 * 
 * So it stores everything based on UMFPACK's form.
 * *******************************************************************************/

/*********************************************************************************
 * This file does the following steps 
 * 1. Read in the matrix that fails to be factorized.
 * 2. Convert the matrix into int64 form. 
 * 3. Dynamically load the routines for factorization (int64 version). 
 *    (you need to input your UMFPACK_PATH to link to the dynamic libraray)
 * 4. Try to factorize it using int32 interface.
 * 5. numeric step never ends
 ********************************************************************************/

int main(int argc, char **argv){

    FILE *fptr;
    int status;
    int size;
    int nnz;
    int *Ai;
    int *Ap;
    int element;
    double *Ax;
    double *null = (double *) NULL ;
    void *Symbolic, *Numeric ;

    Int64 size_64;
    Int64 *Ai_64;
    Int64 *Ap_64;

    /* read in matrix */

    /* open file*/
    fptr = fopen("failed_matrix.dat","r");
    if (fptr == NULL){
        perror("file open failed");
    }

    /* read in all matrix information */
    fread(&size, sizeof(int), 1, fptr);
    fread(&nnz, sizeof(int), 1, fptr);

    Ai = (int *)malloc(nnz*sizeof(int));
    Ap = (int *)malloc((size + 1)*sizeof(int));
    Ax = (double *)malloc(nnz*sizeof(double));

    fread(Ai, sizeof(int), nnz, fptr);
    fread(Ap, sizeof(int), size + 1, fptr);
    fread(Ax, sizeof(double), nnz, fptr);
    /* finish reading matrix*/

    /* create 64-bits container */
    size_64 = (Int64) size;
    Ai_64 = (Int64 *)malloc(nnz*sizeof(Int64));
    Ap_64 = (Int64 *)malloc((size + 1)*sizeof(Int64));

    /* copy into 64-bits form */
    for (element = 1; element <= nnz; element++){
        Ai_64[element - 1] = (Int64) Ai[element - 1];
    }

    for (element = 1; element <= size + 1; element++){
        Ap_64[element - 1] = (Int64) Ap[element - 1];
    }

    /* start factorization step*/

    printf("run symbolic step\n");
    status = umfpack_dl_symbolic(size_64, size_64, Ap_64, Ai_64, Ax, &Symbolic, null, null);
    if (status != 0){
        perror("symbolic");
    }

    printf("run numeric step\n");
    status = umfpack_dl_numeric(Ap_64, Ai_64, Ax, Symbolic, &Numeric, null, null);
    if (status != 0){
        printf("status code : %d\n", status);
    }

    free(Ai);
    free(Ap);
    free(Ax);
    free(Ai_64);
    free(Ap_64);
    return 0;
}

The numeric steps seem to never stops.

Expected behavior I expected the numeric step would end for the UMFPACK 64bit interface.

Desktop (please complete the following information):

OS: macOS Ventura 13.4.1

compiler :

gcc --version output 
Apple clang version 14.0.3 (clang-1403.0.22.14.1)

UMFPACK version

otool -L libumfpack.dylib 
libumfpack.dylib:
    @rpath/libumfpack.6.dylib (compatibility version 6.0.0, current version 6.1.0)
    @rpath/libsuitesparseconfig.7.dylib (compatibility version 7.0.0, current version 7.0.1)
    /usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1319.100.3)
    @rpath/libamd.3.dylib (compatibility version 3.0.0, current version 3.0.3)
    /System/Library/Frameworks/Accelerate.framework/Versions/A/Accelerate (compatibility version 1.0.0, current version 4.0.0)
    @rpath/libcholmod.4.dylib (compatibility version 4.0.0, current version 4.0.3)
    @rpath/libcholmod_cuda.4.dylib (compatibility version 4.0.0, current version 4.0.3)
    @rpath/libcolamd.3.dylib (compatibility version 3.0.0, current version 3.0.3)
    @rpath/libcamd.3.dylib (compatibility version 3.0.0, current version 3.0.3)
    @rpath/libccolamd.3.dylib (compatibility version 3.0.0, current version 3.0.3)

Thanks.

DrTimothyAldenDavis commented 7 months ago

Thanks for the detailed reproducible code. I was able to run it here. I'll upload my files as a PR, if you like.

The matrix can be factorized, but when used as is stands, the matrix takes about 151 seconds on my 20-core desktop. It suffers a lot of fillin.

I tried pulling the matrix into MATLAB (which also uses umfpack), and I noticed something interesting. The matrix has 530,379 entries but only 448,277 nonzeros. That is, 82,102 of the entries are exactly zero. Normally that's not a problem, but I do treat those as legitimate entries (it can happen that subsequent matrices need those entries).

But when I tried it in MATLAB, the factorization took just 2 seconds, and L+U has just 19.6 million entries. MATLAB drops zeros by design.

If I add the zeros with values equal to epsilon (1e-16), the factorization takes 258 seconds in MATLAB (an older version of UMFPACK perhaps explains that difference, or a different BLAS library), and the factorization L+U has 1 billion entries. I think the epsilon values I added are causing trouble there. I'm not factorizing the same matrix in MATLAB as in C, in this case.

I don't see as many entries in L+U from the C code (where zeros are kept but equal to zero), but I still see a lot: about 90 million, and the time is 151 seconds.

So I would consider dropping the explicit zeros from A, if you can do that.

WalterLu3 commented 7 months ago

I'll try that and let you know what happens. Thanks.

DrTimothyAldenDavis commented 7 months ago

Here's my output with your original matrix:

typescript.txt

that's my output, with some added diagnostics, with this program:

loadUMFPACK_long_version.c.txt

and a build script: CMakeLists.txt

my MATLAB script: try_matlab.m.txt

and the output:

and its output: diary.txt

My desktop has 256GB of RAM, so it was able to handle this matrix.

WalterLu3 commented 7 months ago

Hi, after dropping the zeros. It is able to factorize with just 32-bit interface on my 16GB RAM machine!

#include <stdio.h> //perror
#include "umfpack.h"
#include <string.h>
#include <stdlib.h>
#include <ctype.h>

/*********************************************************************************
 * failed_matrix.dat is a binary that stores our instances.
 *  
 * failed_matrix.dat contains the matrix information in the 
 * following order :
 * 
 * 1. int size    (the size of the square matrix. i.e the number of rows and columns)
 * 2. int nnz     (number of non-zero element in our data matrix)
 * 3. int *Ai     (An array of length 'nnz' to store the row indices
 * 4. int *Ap     (An array of length 'size + 1')
 * 5. double *Ax  (An array of length 'nnz' to store matrix data)
 * 
 * 
 * So it stores everything based on UMFPACK's form.
 * *******************************************************************************/

/*********************************************************************************
 * This file does the following steps 
 * 1. Read in the matrix that fails to be factorized.
 * 2. Dynamically load the routines for factorization. (you need to input your
 *    UMFPACK_PATH to link to the dynamic libraray)
 * 3. Try to factorize it using int32 interface.
 * 4. Failed with Out-Of-Memory Error
 ********************************************************************************/

int main(int argc, char **argv){

    FILE *fptr;
    int status;
    int size;
    int nnz;
    int *Ai;
    int *Ap;
    double *Ax;
    int *Ai2;
    int *Ap2;
    double *Ax2;
    double *null = (double *) NULL ;
    void *Symbolic, *Numeric ;

    /* read in matrix */

    /* open file*/
    fptr = fopen("failed_matrix.dat","r");
    if (fptr == NULL){
        perror("file open failed");
    }

    /* read in all matrix information */
    fread(&size, sizeof(int), 1, fptr);
    fread(&nnz, sizeof(int), 1, fptr);

    Ai = (int *)malloc(nnz*sizeof(int));
    Ap = (int *)malloc((size + 1)*sizeof(int));
    Ax = (double *)malloc(nnz*sizeof(double));

    fread(Ai, sizeof(int), nnz, fptr);
    fread(Ap, sizeof(int), size + 1, fptr);
    fread(Ax, sizeof(double), nnz, fptr);
    /* finish reading matrix*/

    Ai2 = (int *)malloc(nnz*sizeof(int));
    Ap2 = (int *)malloc((size + 1)*sizeof(int));
    Ax2 = (double *)malloc(nnz*sizeof(double));

    int number_nnz = 0;
    int col_nnz;
    int colStart, colEnd;
    int start;
    /* drop zeros in the matrix */
    Ap2[0] = 0;
    for (int i = 0; i <=size; i++){
        col_nnz = 0;
        start = Ap[i];
        colStart = Ap[i];
        colEnd = Ap[i+1];
        while(colStart < colEnd){
            if (Ax[colStart] != 0){
                Ai2[number_nnz] = Ai[colStart];
                Ax2[number_nnz] = Ax[colStart];
                number_nnz++;
                col_nnz++;
            }
            colStart ++;
        }
        Ap2[i + 1] = Ap2[i] + col_nnz;
    }
    printf("%d,%d\n",Ap2[size + 1],number_nnz);
    printf("zeros : %d\n", number_nnz);

    /* start factorization step*/

    printf("run symbolic step\n");
    status = umfpack_di_symbolic(size, size, Ap2, Ai2, Ax2, &Symbolic, null, null);
    if (status != 0){
        perror("symbolic");
    }

    printf("run numeric step\n");
    status = umfpack_di_numeric(Ap2, Ai2, Ax2, Symbolic, &Numeric, null, null);
    if (status != 0){
        printf("status code : %d\n", status);
    }

    free(Ai);
    free(Ap);
    free(Ax);
    return 0;
}

Maybe HiGHS has done something for the zero elements in the matrix, I will probably look more into that when I have time. Thanks for the detailed explanation!

DrTimothyAldenDavis commented 7 months ago

Glad to help!

DrTimothyAldenDavis commented 7 months ago

Can you tell me more about this matrix? It could make for an interesting problem to add to my collection, as a counter-example.

WalterLu3 commented 7 months ago

Sure. I was doing work on adding new basis packages to the PATH solver, which is a specialized solver for the mixed complementarity problem. The basis packages are used to calculate the basic feasible direction for pivoting since the PATH solver adopts a method similar to the simplex method. When I was testing all the available packages in PATH with some problem sets, a problem failed (a dynamic equilibrium problem). The failed matrix I extracted corresponds to the basic feasible direction, $B^{-1}A_j$ of the pivot where things collapse.

The reason why it has zeros in the data is exactly what you said - it can happen that subsequent matrices need those entries (we indeed reuse the matrix structure). Essentially, we are doing things similar to extracting the columns of Jacobian of a nonlinear map to form a basis. When we are identifying the number of non-zero elements, we are identifying the entries that are "structurally nonzero." So an entry that has value $x_1x_2$ will be identified as non-zero, but the value may be zero because it depends on the given value of $x_1$ and $x_2$.

DrTimothyAldenDavis / SuiteSparse

Matrix instance where UMFPACK 64 routines fails to factorize #577