Coarsening updates - Githubissues

Rewrote parts of coarsening algorithm to reflect that there is only one coarsening step, and improved clarity of comments

About the space optimization for the coarsening method:

In order to improve the space usage from O(n + e) to O(e), the following changes must occur: (1) Do not compute a full vector of length num_nodes (2) The node_parent vector must NOT have entries for singletons (3) If preserve_mapping = false and we need to compute a compressed parent vector in Parent_to_S, then this compressed parent vector must NOT have entries for singletons

(2) can be done by examining the E matrix. In particular, singleton nodes will have empty rows in E. We can run a GrB_reduce on E and use the result as a mask to identify non-singleton nodes. Now, we can change the GrB_apply used to populate empty entries in node_parent to use this non-singleton mask as the input instead of a full vector.

(3) is trickier. Everything in the Parent_to_S function up until the GrB_extract can be kept as is. Up until this point, we have computed the compressed parent for all nodes that:

Are preserved in the coarsened graph
Are NOT singletons Note that this intermediate result is also the new labels vector that tells us how preserved nodes are relabeled in the coarsened graph. In the space-optimized version, this new labels vector will NOT include new labels for singletons. The user must appropriately handle this. See the explanation below under "Potential easy solution to space optimization" for why this is OK.

The original purpose of the GrB_extract is to populate the compressed parents for discarded nodes (nodes that disappear in the coarsened graph). However, it only works because the parent vector passed as input into Parent_to_S is full, and we are OK with writing into all entries of the compressed parent vector. Neither of these are now true. In particular, the indices of the input parent vector can be partitioned into three parts:

Indices that correspond to preserved, non-singleton nodes (we already have the compressed parents for these; they are their own parents) (present)
Indices that correspond to discarded, non-singleton nodes (present)
Indices that correspond to singleton nodes (empty)

Our goal is to populate the compressed parents for discarded, non-singleton nodes. A possible strategy is the following:

Unpack the...
WIP, this gets pretty ugly, see below instead.

Potential easy solution to space optimization: Eliminate singleton nodes upfront; we will build a new A matrix corresponding to the input graph without singleton nodes. This can be done by using a reduce_to_vector on the input matrix to identify singletons, then using the contents of that vector to GrB_extract from the input matrix into our singleton-free matrix. We will then run our coarsening as usual on this new matrix. We will have new outputs for the user to provide information about the singleton removal; see comments at the top of LAGraph_Coarsen_Matching for more details.

Note: This approach involves a GrB_extract to pull rows/cols from the original A matrix, which seems to be very slow.

GraphBLAS / LAGraph

Coarsening updates #236

About the space optimization for the coarsening method: