GraphBLAS / graphblas-api-c

Other
7 stars 3 forks source link

Adding more formats to GrB_Format #83

Open rayegun opened 8 months ago

rayegun commented 8 months ago

In trying to implement a "GrB_only" mode a gigantic headache is Matrix_import and Matrix_export. Full is an obvious inclusion, and the doubly compressed formats are fairly well known at this point.

I think it's feasible for us to allow an implementation to return GrB_NOT_IMPLEMENTED for doubly compressed without any issue. But lack of Full is a real footgun I feel (unless I'm missing the "obvious" way to do this).

eriknw commented 8 months ago

Yeah, this is missing, and awkward to do. python-graphblas has a vanilla version of Matrix.from_dense (and to_dense) that uses CSR: https://github.com/python-graphblas/python-graphblas/blob/522b696e157a0fb00b63a6a846913e075106a1d5/graphblas/core/matrix.py#L1419-L1434

rayegun commented 8 months ago

What's the performance like relative to a dense copy? I bet not ideal.

Because of this I've dropped a GrB only version for now. I may try again after working on IndexBinary

DrTimothyAldenDavis commented 8 months ago

I have placeholders for GrB_DENSE_ROW_FORMAT and such:

https://github.com/DrTimothyAldenDavis/GraphBLAS/blob/4f242bf71b936b96f1a866eabd9446fbabda656e/Source/GrB_Matrix_export.c#L96

I got it to work but of course didn't add it because it's not in the spec. Exporting a matrix in dense format would be a single parallel memcpy (if it's in the right format already). If the matrix had any entries missing, my proposal was that an error would be returned (GrB_INVALID_VALUE), but the API committee didn't like that suggestion and so decided to just not add the option of a GrB_Matrix_import / export of a dense matrix.

If the matrix was not in a dense format already, or if held by row and the export is by-column, then I make a copy. I could have relaxed the former case but didn't get to it. That is, if a matrix is in sparse compressed-column format with all entries present, and a GrB_DENSE_COL_FORMAT is requested for export, then it could be just a single parallel memcpy. If the matrix is iso-valued, then I expand my single iso value into the numerical array being exported, since there's no iso-export.

So if you ask me to export a matrix in CSR format, but its in a dense row format, I just make a temporary copy, convert it in place to CSR, and then memcpy that to the user output (yes, that's a double or triple copy that I could avoid, but I decided that was not worth optimizing since the API needs to be fixed anyway).

DrTimothyAldenDavis commented 8 months ago

One option that might not break the API much:

if you export in CSR format but pass in NULL pointers for some of the outputs, then those are not exported. Currently that is an error.

Then, if you know the matrix has all entries present, you could just export the values and not the pattern (by passing in NULL for the Ap and Ai components, as I call them). End result is the export of a dense matrix held by row.