GraphBLAS / graphblas-api-c

Other
7 stars 3 forks source link

Extract C<M> += A(I,J) where I and J are GrB_Vectors #67

Open DrTimothyAldenDavis opened 2 years ago

DrTimothyAldenDavis commented 2 years ago

GrB_extract has several limitations that need to be fixed. These changes are needed for the Connected Compenents algorithm in LAGraph, and likely other graph algorithms. Consider C += A(I,J), where the mask M and accumulator (+) are not part of this discussion; no need to change those. So just consider C = A(I,J).

(1) GrB_Vectors as index arrays: the method for expressing row/column indices is via non-opaque GrB_Index arrays, I and J as I call them. This is not a good fit for algorithms that use GraphBLAS since they will likely be computing these index arrays with GraphBLAS itself, as GrB_Vectors. Then to pass the vectors to GrB_extract, they must use extractTuples first. This is slow, requiring an extra copy and also forcing a "block". If the inputs are GrB_Vectors I and J, then no copy needs to be made. As an example of this extractTuples, see the LG_CC_Boruvka ( https://github.com/GraphBLAS/LAGraph/blob/reorg/src/algorithm/LG_CC_Boruvka.c ) and LG_CC_FastSV6 ( https://github.com/GraphBLAS/LAGraph/blob/reorg/src/algorithm/LG_CC_FastSV6.c ). In particular, these 2 lines of code: https://github.com/GraphBLAS/LAGraph/blob/03cae9363f123dfe5572065a602b9c14ed4169e6/src/algorithm/LG_CC_Boruvka.c#L217 . Currently, all the examples I know of that want to use GrB_extract compute I and J with GraphBLAS and then they must use extractTuples before calling GrB_extract.

(2) Colon "notation": There is no method for expressing the index range I = lo:stride:hi. This could be done with a GrB_Index array of size 3 (I do this already) or a GrB_Vector of size 3, with a descriptor setting. See issue #15.

(3) Integer indices acting in a bollean sense: We need a method for using I and J in a mask-like fashion (not to be confused with M in the C= ... part; this is different). So consider just C=A(I,J). The GrB_Vectors I and J must have size nrows and ncols, respectively. C and A have the same dimension. The computation for any given entry A(i,j) would be:

if (I(i) is true and J(j) is true) then C(i,j) = A (i,j) else C (i,j) is not an entry.

This kind of computation is needed in the two CC methods in LAGraph. The statement "I(i) is true" could be valued, as written, or structural, as in "I(i) exists in the structure of the vector I", just like the value vs structural use of the mask M. This kind of extraction cannot be done by GrB_select, unless it is with a user-defined operator that then dereferences the arrays I and J on its own (see for example: https://github.com/GraphBLAS/LAGraph/blob/03cae9363f123dfe5572065a602b9c14ed4169e6/src/algorithm/LG_CC_Boruvka.c#L254 , which is slow).

An elegant extension of this boolean indexing idea would add a binary operator. The statement above:

for each entry A(i,j) that exists in the matrix A:
    if (I(i) is true and J(j) is true) then C(i,j) = A (i,j) else C (i,j) is not an entry.

could be done this way instead. The user passes in the GrB_LAND operator as the op, and the computation becomes:

for each entry A(i,j) that exists in the matrix A:
    if (op(I(i),J(j)) then C(i,j) = A (i,j) else C (i,j) is not an entry.

If the vectors I and J are sparse, with missing entries, then op(I(i),J(j)) would only be applied in an "eWiseMult" sense; that is, only used if I(i) and J(j) are both present, and the result of this test would be false otherwise.

This kind of extract would be simple to implement and incredibly useful. The LG_CC_FastSV6 method needs to delete all entries in the matrix that are in the current largest connected component. See https://github.com/GraphBLAS/LAGraph/blob/03cae9363f123dfe5572065a602b9c14ed4169e6/src/algorithm/LG_CC_FastSV6.c#L430 . That entire body of code could be replaced (mostly) with:

GrB_apply (I, NULL, NULL, GrB_NE_UINT64, parent, key, NULL) ;    // no extensions here; pure GrB. key is a scalar.
GxB_extract (T, NULL, NULL, GrB_LAND, A, I, I, NULL) ;  // using the boolean index vector I

The above 2 lines of code do the following computation:

for each entry A(i,j)
     if (parent (i) != key AND parent (j) != key) then T(i,j) = A(i,j) else T(i,j) does not appear in the matrix T

The body of code in LG_CC_FastSV6 also adds an extra term: T(i,key) is inserted as a new entry if the row T(i,:) differs from A(i,:). That can be done in several ways, as a separate step, so ignore it for this example. The primary work is the 2 lines above.

Consider also the user-defined index_unary operator in LG_CC_Boruvka: it returns true if Px(i) != Px(j), where Px is passed in as a void * pointer. A better method would be:

GxB_extract (S, NULL, NULL, GrB_NE_UINT64, parent, parent, NULL) ;

which computes the exact result desired. The line above deletes all edges (i,j) where node i and node j appear in the same connected component. That is, it does the following:

for each entry S(i,j)
     if (parent (i) != parent (j)) then S(i,j) = S (i,j) (keep the entry) else S(i,j) does not appear in the matrix S on output

That is, it does the following:

for each entry S(i,j)
    if (parent (i) != parent(j)) then keep the entry S(i,j) else delete it.

The expression "parent (i) != parent (j)" is specified by passing in the GrB_NE_UINT64 operator, and using the GrB_Vector parent for both inputs I and J, to compute S = S(I,J).

(4) Mask-like boolean index vectors: There is one more way to use these two vectors I and J. Assume they are already boolean (just like the mask M is treated). We have the ability to complement a mask M. I and J act like a Cartesian product, and it would be nice to complement that too. Or handle any combination of complements and boolean operations between I and J.

So imagine a mask-style vector MI used in a structural sense: MI(i) = true if I(i) is present, and false otherwise. Or let MI be constructed in a valued sense where MI(i) is true if I(i) is present and has the value true, and false otherwise. This is exactly how a mask M is handled already in GraphBLAS, but this is now used to interpret the index GrB_Vector I for GxB_extract.

But since there are TWO mask vectors MI and MJ, then cannot simply by taken as-is or complemented. For a conventional mask M there are two unary functions to implicitly apply: complement the mask, or do not complement the mask. With TWO mask vectors MI and MJ, there are 16 boolean possible functions to apply, z = f(x,y) where z,x,y are all boolean.

The descriptor would say how I and J are two be interpretted: as (1) plain index vectors (see (1) above), or (2) as "colon notation " (see (2) above), as inputs to an arbitrary binary operator (see (3) above), or as this mask-like behavior. If mask-like then the statement:

GxB_extract (C, M, accum, binaryop, A, I, J, descriptor)

would compute the following matrix T (later doing C+=T for the standard mask/accum phase):

Construct the mask vector MI from I, just like we would already for GrB_Vector_apply (v, I, ...) for a vector I.
Do the same to construct MJ from J.
for each entry A(i,j)
     if (binaryop (MI(i), MJ(j)) then T(i,j) = A(i,j) else T(i,j) does not appear as an entry in T

Then binaryop would be any of 16 possible binary operators, like GrB_LAND, GrB_XOR, GrB_XNOR, you name it. Sadly, the suite of boolean binary operators in GraphBLAS is not complete. We do not have all 16 of them. We are missing NAND(x,y) and NOR(x,y) for example. See this table for details: https://github.com/DrTimothyAldenDavis/GraphBLAS/blob/ccb8d243f1bb3ab9668f25011b01634eb7af53b5/Include/GraphBLAS.h#L1444

DrTimothyAldenDavis commented 2 years ago

The first step for considering this for the spec is for me to implement as GxB_extract. That will allow me to test out and refine the design of the API for this method.

DrTimothyAldenDavis commented 2 years ago

The mask-like feature, (4) in the list above, would be very useful in many contexts. Here's an example.

Consider a huge graph with n = 2^60 nodes, and a hypersparse adjacency matrix G.

Suppose we have a very sparse vector I, where I(k) = true if node k has some property, and no other entries appear. This implicitly divides the matrix into 2 sets of nodes: (A) those with the property and (B) those without, but creating a full boolean vector with n entries is too costly.

Say we want to create a new graph C where C contains just those edges (i,j) where either I(i) is true, or I(j) is true, but not both. That is, construct C that has all edges that cross the gap from set (A) and set (B).

GrB_Descriptor_set (desc, ... something here to get mask-like index vector behavior ...) GxB_extract (C, NULL, NULL, GrB_XOR, G, I, I, NULL)

It would be impossible to do this with GrB_extract. It would require the construction of an explicit list for the vector I (which is easy, using extractTUples), but it would also require the construction of the list of indices NOT in this list, of size almost 2^60. That is impossible.

Just like complementing a mask is cheap, and doesn't require an explicit "~M" to be built, so too does this "XOR (I,I)" Cartesian index set can be dealt with implicitly.

The only way to do this in GraphBLAS now is to use GrB_Matrix_export, or GrB_Matrix_extractTuples, then do the work yourself. That takes time and space of Theta(nvals(G)), while the GrB_extract can do the work in Omega(nvals(C)) which might be just a bit more than Theta(nvals(C)) if I can be clever enough in its implementation. And nvals(C) << nvals(G) << nrows(G) can easily hold.

GrB_select with a user-defined operator would also be too slow. It must apply the operator to all entries in G, taking O(nvals(G)) time. (This is why LG_CC_Boruvka is currently much slower than LG_CC_FastSV6 in the LAGraph draft, as of Nov 26, 2021).

DrTimothyAldenDavis commented 1 year ago

This is something RedisGraph needs as well.

DrTimothyAldenDavis commented 1 year ago

"GxB_extract" is not the right name for this method, since it would be confused with GrB_extract. I'm not proposing a change to GrB_select or GrB_extract. I think a new name would be needed for this "select via Cartesian mask" idea/algorithm.

temporalengineer commented 11 months ago

What is the best (fastest) way to remove a list of nodes from a large graph without changing the dimensionality of the matrix.

The adjacency matrix is unweighted with 1.0 representing a connection (and no entry preferred or 0.0) if the nodes are not connected.

I’m: 1) applying the page_rank_operator (LAGraph) to a graph with 40,000 nodes

2) then removing a selection of nodes from the graph
    Using either GrB_Matrix_removeElement or GrB_Matrix_assign_FP64 assigning the value a 0.0

3) then applying page_rank_operator again to the same graph.

This process is repeated multiple times to converge to a result..

The code to remove the nodes is something like this:

for some node_id in a collection of nodes:

GxB_Iterator iterator{nullptr};
GxB_Iterator_new(&iterator);
// attach it to the adjacency matrix
GxB_Matrix_Iterator_attach(iterator, the_matrix, GrB_NULL;

// seek to the first entry
info = GxB_Matrix_Iterator_seek(iterator, 0);

GrB_Index i{0};
GrB_Index j{0};
while (info != GxB_EXHAUSTED) {
    // Get indexes for the entry the_matrix(i,j)
    GxB_Matrix_Iterator_getIndex(iterator, &i, &j);
    if ((i == node_id) || (j == node_id)) {
    //          GrB_Matrix_removeElement(the_matrix, i, j);         This much slower than
        GrB_Matrix_assign_FP64(the_matrix, GrB_NULL, GrB_PLUS_FP64, 0.0, &i, 1, &j, 1, GrB_NULL);

    }
    // move to the next entry in the_matrix
    info = GxB_Matrix_Iterator_next(iterator);
}
GxB_Iterator_free(&iterator);

I’d like to remove the elements from the graph, since assigning 0.0 does generate slightly different results. But assigning the the node to 0.0 is much faster than removing the nodes.

And I need to maintain the size of the matrix and the node ID assignments so I can map the results back to a set of externally managed labels

DrTimothyAldenDavis commented 11 months ago

The best method is probably to use GrB_select with your own user-defined GrB_IndexUnaryOp.

Create a GrB_Scalar type that is a pointer to bool. Then create a dense bool vector (call it Keep, let's say), Then create a GrB_Scalar that is a pointer to this dense bool vector.

Keep [i] = true if you want to keep row /column i.

The GrB_IndexUnaryOp gets passed a "thunk" value, along with the row i and column j of the entry being considered. It must return true if the entry is to be kept.

So the op should return true if Keep [i] is true and if Keep [j] is true. Something like :

typedef struct { bool *Keep_pointer ; } mykeep_t ;

void mykeeper (bool *z, const double *x, GrB_Index i, GrB_Index j, const mykeep_t *y)
{
    bool *Keep = y->Keep_pointer ;
    (*z) = Keep [i] && Keep [j] ;
}

Be sure to use GxB_Type_new and GxB_IndexUnaryOp_new for these objects, to ensure that the JIT can be used.

This is pretty safe and doesn't require any global values. It is a little bit dangerous because the pointer dereferencing to the user Keep array could get changed if aggressive kernel fusion is done inside GraphBLAS, but I don't do that yet so this is safe.

You can see examples of this kind of method in LAGraph:

experimental/algorithm/LAGraph_msf.c experimental/algorithm/LAGraph_scc.c src/algorithm/LG_CCBoruvka.c (this is the most robust example, but use GxB*new to enable the JIT)

Since the usage is very common, we need something new in the C API to handle this more elegantly.

DrTimothyAldenDavis commented 11 months ago

p.s. You can't use the matrix GxB iterator. Those methods require the matrix you're iterating over to not change, but you're changing it. That's not safe to do. Changing a matrix in any way invalidates the GxB iterator.

Even if the iterator would work, it's fundamentally sequential while GrB_select is parallel.

DrTimothyAldenDavis commented 11 months ago

Another way to look at this problem is to consider a "Cartesian mask". If P is a vector of size nrows (perhaps sparse, and Q a GrB_Vector of size ncols, then the implicit mask is PxQ, using some kind of 2-input or 3-input operator x. I think this kind of select method would need its own name, such as GxB_select_Cartesian. It could also take a standard mask M.

GxB_select_Cartesion (C, M, accum, op, A, P, Q, y, descriptor) The op would have to be a new kind of operator with the following signature (just like IndexUnaryOp but with 2 new parameters):

void fcartesian
(
    void *z,           // output of type ztype (just like IndexUnaryOp)
    const void *x,     // the input value A(i,j), just like IndexUnaryOp
    const void *p,     // the input value of P(i), or NULL if no entry, of type ptype
    const void *q,     // the input values of Q(j), or NULL if no entry, of type qtype
    GrB_Index i,       // the row index (just like IndexUnaryOp)
    GrB_Index j,       // the col index (just like IndexUnaryOp)
     const void *y.    // the "thunk" scalar, just like the current IndexUnaryOp, type ytype
) ;

This operator is just like the GrB_IndexUnaryOp, except that it adds p and q.

The use of PxQ can be complemented, just like the mask complement. That is, in the default case, only entries in the pattern of P*Q' would make it through to the operator and all other entries are dropped. P and/or Q can be structurally complemented (not value!). So for example, to pick rows not in the pattern of P, but in Q, use (not P)xQ. In that case, the operator is passed p of NULL and q as non-NULL. Entries in a row i where P(i) is present in the pattern of P would not get selected at all and the operator would not be called.

This change should be considered at the same time as the future GrB_IndexBinaryOp operator, since that operator and this one could mutually affect each other's design.

It's not trivial to add this functionality, but I see a need for it in so many places. It's not feasible to compute a Mask matrix as the outer product P*Q' (treating P and Q as column vectors) since that matrix could be completely dense.

This new operator would only be used by GxB_select_Cartesian, with the signature:

GxB_select_Cartesion (GrB_Matrix C, GrB_Matrix M, GrB_BinaryOp accum,
    GxB_CartesianOp, GrB_Matrix A, GrB_Vector P, GrB_Vector Q, GrB_Scalar y,
   GrB_Descriptor desc) ;

For vectors, all matrices above would be vectors. P is still a vector but Q becomes a GrB_Scalar. The descriptor would be augmented with new options: PCOMPLEMENT and QCOMPLEMENT, which would handle the structural complements of P and Q. Note that y must be a GrB_Scalar (not a million variants with C scalars ...!).

Since this is a kind of select operation, the name GxB_select_Cartesian seems reseaonable. But it could also be called GxB_cartesian.

michelp commented 10 months ago

@DrTimothyAldenDavis Isn't this "cartesian mask" mechanism what we were talking about in our call today?

DrTimothyAldenDavis commented 10 months ago

No, it was the creation of a mapping/permutation matrix, from a vector. This is what is done here. It's super fast and looks simple but it's super crazy to wrap your mind around:

https://github.com/GraphBLAS/LAGraph/blob/7887f54875d5659e701809d623031fe0afd0aa0c/src/algorithm/LG_CC_FastSV6.c#L95-L132