jluttine / suitesparse

PLEASE SEE THE OFFICIAL REPOSITORY. THIS IS NOT MAINTAINED ANYMORE.
https://github.com/DrTimothyAldenDavis/SuiteSparse
Other
93 stars 44 forks source link

Sparse constructs return uninitialized elements #6

Closed szarnyasg closed 5 years ago

szarnyasg commented 5 years ago

Hi everyone,

I am working on graph analytics with my Master's student, @Lehel44.

When using GraphBLAS, we encountered some strange behaviour, documented in the code snippet below. We are beginners with GraphBLAS, so it's possible that we use the API incorrectly -- please advise on what changes we should make. We are aware that this is not an official repository, but it might be a good place to start.

Thank you.

#include "GraphBLAS.h"

void printDoubleVector(GrB_Vector vec) {
    GrB_Index size;
    GrB_Vector_size(&size, vec);
    double element;
    for(unsigned int i = 0; i < size; i++) {
        GrB_Vector_extractElement(&element, vec, i);
        printf("%lf ", element);
    }
    printf("\n");
}
void printDoubleMatrix(GrB_Matrix M) {
    GrB_Index size;
    GrB_Matrix_nrows(&size, M);
    double element;
    for (unsigned int i = 0; i < size; i++) {
        for (unsigned int j = 0; j < size; j++) {
            GrB_Matrix_extractElement(&element, M, i, j);
            printf("%lf ", element);
        }
        printf("\n");
    }
    printf("\n");
}

int main (int argc, char **argv) {

    GrB_Index size = 4;
    GrB_Vector v = NULL;

    // Create index and value arrays to build a vector.
    GrB_Index* indices = malloc(size * sizeof(GrB_Index));
    double* values = malloc(size * sizeof(double));

    for (unsigned int i = 0; i < size; i++) {
        indices[i] = i;
        values[i] = 10.0;
    }

    // Initialize and build the vector from tuples.
    GrB_Vector_new(&v, GrB_FP64, size);
    GrB_Vector_build(v, indices, values, size, GrB_FIRST_FP64);

    // Print vector.
    printf("v:\n");
    printDoubleVector(v); // 10.000000 10.000000 10.000000 10.000000

    // Set 2nd element to 20 and print again.
    GrB_Vector_setElement(v, 20.0, 2);
    printDoubleVector(v); // 10.000000 10.000000 20.000000 10.000000

    // -------------------------------------------------------------------------------------
    // This works as it's supposed to. Let's see what happens if we don't build the vector.
    // -------------------------------------------------------------------------------------

    GrB_Vector w = NULL;

    printf("w:\n");
    // Initialize and print.
    GrB_Vector_new(&w, GrB_FP64, size);
    printDoubleVector(w); // 0.000000 0.000000 0.000000 0.000000

    // Set 2nd element to 20 and print again.
    GrB_Vector_setElement(w, 20.0, 2);
    printDoubleVector(w); // 20.000000 20.000000 20.000000 20.000000

    // -------------------------------------------------------------------------------------
    // All elements are changed to 20.0. Why?
    // Now create and build a 4-length vector with 2 tuples.
    // -------------------------------------------------------------------------------------

    GrB_Vector r = NULL;

    // Create index and value arrays to build a vector.
    unsigned int r_tuples = 2;
    GrB_Index* r_indices = malloc(r_tuples * sizeof(GrB_Index));
    double* r_values = malloc(r_tuples * sizeof(double));

    r_indices[0] = 0;
    r_values[0] = 11.4;

    r_indices[1] = 2;
    r_values[1] = 85.3;

    // Initialize, build and print.
    GrB_Vector_new(&r, GrB_FP64, size);
    GrB_Vector_build(r, r_indices, r_values, r_tuples, GrB_FIRST_FP64);
    printf("r:\n");
    printDoubleVector(r); // 11.400000 11.400000 85.300000 85.300000

    // -------------------------------------------------------------------------------------
    // What is happening?
    // -------------------------------------------------------------------------------------

    // Set 2nd element to 20 and print again.
    GrB_Vector_setElement(r, 20.0, 2);
    printDoubleVector(r); // 11.400000 11.400000 20.000000 20.000000

    // -------------------------------------------------------------------------------------
    // What is happening again? In bfs5m.c, lines 46-47 show that vectors can be used
    // by simple initialization, and there is no need to build them.
    // https://github.com/jluttine/suitesparse/blob/306d7f11792aca524571f4ae142bcd8b7e38c362/GraphBLAS/Demo/Source/bfs5m.c#L46-L47
    // -------------------------------------------------------------------------------------

    // Now build a 4x4 matrix from tuples.
    //
    // M = 1 2 10.0
    //     2 3 20.0
    //

    GrB_Matrix M = NULL;

    unsigned int M_tuples = 2;
    GrB_Index* MI = malloc(M_tuples * sizeof(GrB_Index));
    GrB_Index* MJ = malloc(M_tuples * sizeof(GrB_Index));
    double* M_values = malloc(M_tuples * sizeof(double));

    MI[0] = 1;
    MJ[0] = 2;
    M_values[0] = 20.0;

    MI[1] = 2;
    MJ[1] = 3;
    M_values[1] = 10.0;

    // Initialize and build matrix.
    GrB_Matrix_new(&M, GrB_FP64, size, size);
    GrB_Matrix_build(M, MI, MJ, M_values, M_tuples, GrB_FIRST_FP64);

    // Print the matrix.
    printf("M:\n");
    printDoubleMatrix(M);

    //  0.000000  0.000000  0.000000  0.000000
    //  0.000000  0.000000 20.000000 20.000000
    // 20.000000 20.000000 20.000000 10.000000
    // 10.000000 10.000000 10.000000 10.000000

    // -------------------------------------------------------------------------------------
    // Random values appear again. Now do a matrix-vector multiplication.
    //
    // w looks like: 20.000000 20.000000 20.000000 20.000000
    // w should be:   0.000000  0.000000 20.000000  0.000000
    // -------------------------------------------------------------------------------------

    GrB_Vector result = NULL;
    GrB_Vector_new(&result, GrB_FP64, size);

    GrB_mxv(result, NULL, NULL, GxB_PLUS_TIMES_FP64, M, w, NULL);

    // Result:
    // 0.000000 400.000000 400.000000 400.000000
    // It looks like it's counting something else.
    printDoubleVector(result); // 0.000000 400.000000 400.000000 400.000000

    GrB_free(&v);
    GrB_free(&w);
    GrB_free(&r);

    free(indices);
    free(values);
    free(r_indices);
    free(r_values);
    free(MI);
    free(MJ);
    free(M_values);
}
swilly22 commented 5 years ago

As a good practice start by initialising graphBLAS assert(GrB_init(GrB_NONBLOCKING) == GrB_SUCCESS);

When debugging or in general check return value from GraphBLAS calls as it might give information why an operation failed.

I'm not sure why but GraphBLAS won't update element to ZERO when entry [i,j] is not set, and so element would cary its previous value.

`void printDoubleVector(GrB_Vector vec) {

GrB_Index size;

GrB_Vector_size(&size, vec);

double element;

for(unsigned int i = 0; i < size; i++) {

    GrB_Vector_extractElement(&element, vec, i);

    printf("%lf ", element);

    element = 0.0;
}

printf("\n");

}`

DrTimothyAldenDavis commented 5 years ago

I’ll take a look and get back to you shortly.

Tim Davis

DrTimothyAldenDavis commented 5 years ago

GrB_extractElement does not return anything in &element if the entry is not present. It does not return zero. It leaves the values in 'element' unmodified. It returns an info value:

info = GrB_Vector_extractElement ( ... )

with info equal to GrB_NO_VALUE.

It can't return zero in the scalar 'element'. The value of the entry that is not present in the matrix or vector is not zero, but the identity value of the additive monoid. That could be anything. It is -INFINITY if the monoid is 'max', for example. It is unknown entirely.

So the value of element is undefined on the 2nd usage. It happens to be 20, but that is likely just what's randomly at that memory location from the last call to the printDoubleVector.

I will send you a fix to your code soon.

DrTimothyAldenDavis commented 5 years ago

Here is the fixed code:

fixed.c.txt

DrTimothyAldenDavis commented 5 years ago

The fixed.c just checks the return value of the extractElement functions and prints 'no value' if the return value is GrB_NO_VALUE.

Here is the correct output:

fixed_out.txt

MATLAB does the same thing, but it only has two semirings, and for both of them, the identity value of the additive monoid is zero:

trythis.m.txt

trythis_diary.txt

DrTimothyAldenDavis commented 5 years ago

One thing that would help would be a "GxB_print" method that would print the contents of a GraphBLAS object. That would have clarified your problem. I've seen others write their own 'print matrix' function, too. It's a common need. I have my own functions, but they are internal and "can't" be called by the user.

However, if you do

include "GraphBLAS/Source/GB.h"

and also put -IGraphBLAS/Source in your compile step, then do

GB_check (A, "my matrix A is:", level) ;

where level = 0 (nothing), 1 (print a terse line), 2: print 30 entries or so, 3: print it all. I plan on adding a user interface to that function. It's not in the spec.

Lehel44 commented 5 years ago

Thank you for the quick response and correction. I changed the print method as you suggested and it works correctly now, it solved our problem.