GraphBLAS / graphblas-api-c


"Query everything": a summary of many related issues. #66

Closed DrTimothyAldenDavis closed 7 months ago

DrTimothyAldenDavis commented 2 years ago

Mechanisms for querying GraphBLAS objects, state, and library info. The following exists, which is a start:

#define GRB_VERSION 2
#define GRB_SUBVERSION 0
GrB_Info GrB_getVersion (&version, &subversion)

It is possible to query a matrix for nrows, ncols, and nvals, but not its type. That is a problem. An application needs to be able to query everything about GraphBLAS opaque objects.

CATEGORY 1: all results would go into user-visible, non-opaque arrays.

User-visible results are better for many uses, such as saving data to a file or sending data across an MPI channel, where pointers to things (like a GrB_Type) cannot be shared:

(1.1) query the library implementation, date, and version, both at run time and with #defines. Issue: #2

(1.2) query the mode passed to GrB_init (GrB_BLOCKING, GrB_NONBLOCKING). Issue #4.

(1.3) query the domains (types) of operators: Issue #5. This should be returned as a string. Returning a GrB_Type is not a good idea. Note that this requires user-defined types to have a name. The query would return the name of the C type, as in "bool", "int8_t", ... "float", "double". For user-defined types it would return the name of the C typedef; see issue #32.

(1.4) query the size of a GrB_Type, returning a size_t. (See #6)

(1.5) query a GrB_Matrix, GrB_Vector, GrB_Scalar, or the serialized data from GrB_Matrix_serialize for its type. This should be returned as a string ("bool", "int8_t", ... "float", "double"). LAGraph has an ugly workaround for the lack of this feature in GraphBLAS; this is a MUST-have.

(1.6) query the identity value of a monoid, returning the result into a uint8_t array perhaps, of size given by (1.4). See #7

(1.7) query the state of a GrB_Descriptor; see #13.
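As a rough illustration of what a Category 1 query could look like, here is a minimal sketch of item (1.6): copying a monoid's identity value into a caller-owned uint8_t buffer. Everything named `demo_*` is a hypothetical stand-in invented for this example, not part of the GraphBLAS C API, and the fixed 16-byte identity storage is an assumption made only to keep the sketch short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an opaque monoid (NOT the GraphBLAS struct). */
typedef struct {
    const char *type_name;  /* C type name of the domain, e.g. "double" */
    size_t type_size;       /* size of that type, as in item (1.4) */
    uint8_t identity[16];   /* raw bytes of the identity value (assumed max size) */
} demo_monoid;

/* Build a TIMES monoid over double; its identity is 1.0. */
demo_monoid demo_times_fp64(void)
{
    demo_monoid m;
    double one = 1.0;
    m.type_name = "double";
    m.type_size = sizeof(double);
    memset(m.identity, 0, sizeof m.identity);
    memcpy(m.identity, &one, sizeof one);
    return m;
}

/* Copy the identity into a user-visible byte buffer.
   Returns 0 on success, -1 if the buffer is smaller than the type. */
int demo_monoid_identity(const demo_monoid *m, uint8_t *buf, size_t buflen)
{
    assert(m != NULL && buf != NULL);
    if (buflen < m->type_size) return -1;
    memcpy(buf, m->identity, m->type_size);
    return 0;
}
```

The caller would first ask for the type size, allocate that many bytes, and then memcpy the bytes back into a typed variable once it knows the type name. Because the result is plain bytes plus a string, it can be written to a file or sent over MPI without sharing any pointers.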

**CATEGORY 2: These query functions can also be added; they return a GrB object, not a user-visible non-opaque result.** They might be useful but not essential. Items 2.2, 2.3, and 2.4 (Issue #7) would be needed for consistency with the ability to query all other GrB objects, however.

(2.1) given the name of a type as string, return the corresponding GrB_Type. Return NULL if the type is user-defined. Note that this can also be written by the user. I have a GxB_Type_from_name function that does this.

(2.2) query the monoid of a semiring (see #7)

(2.3) query the multiplicative operator of a semiring (#7)

(2.4) query the binary op of a monoid, returning a GrB_BinaryOp (#7)
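The type-from-name lookup in the first Category 2 item is simple enough that a user could write it, as noted above. The sketch below shows one way it might work, using a small table of built-in C type names. The `demo_type` struct and both functions are hypothetical stand-ins for illustration; a real version would return a GrB_Type rather than a pointer into this table, and the table here lists only a few of the built-in types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a built-in type descriptor (NOT a GrB_Type). */
typedef struct {
    const char *name;  /* name of the C type */
    size_t size;       /* sizeof the C type */
} demo_type;

/* Partial table of built-in types; a full version would list all of them. */
static const demo_type demo_builtin_types[] = {
    { "bool",    sizeof(bool)    },
    { "int8_t",  sizeof(int8_t)  },
    { "int32_t", sizeof(int32_t) },
    { "int64_t", sizeof(int64_t) },
    { "float",   sizeof(float)   },
    { "double",  sizeof(double)  },
};

/* Return the built-in type with the given name, or NULL if the name is
   not built-in (for example, a user-defined type). */
const demo_type *demo_type_from_name(const char *name)
{
    if (name == NULL) return NULL;
    size_t n = sizeof demo_builtin_types / sizeof demo_builtin_types[0];
    for (size_t i = 0; i < n; i++) {
        if (strcmp(demo_builtin_types[i].name, name) == 0) {
            return &demo_builtin_types[i];
        }
    }
    return NULL;
}

/* Size query in the spirit of item (1.4): returns a size_t. */
size_t demo_type_size(const demo_type *t)
{
    assert(t != NULL);
    return t->size;
}
```

Returning NULL for an unknown name leaves it to the caller to resolve user-defined types, which matches the behavior described for the lookup.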

DrTimothyAldenDavis commented 2 years ago

This issue is painful.

I'm trying to write an experimental LAGraph method that can write a set of matrices to a single file, and read them back again. I have an elegant design using a JSON header (in ascii), followed by the binary "blob" coming from GrB_Matrix_serialize. The file will be library-specific, but the JSON header will indicate the library used, and its version.

But I can't tell what library or version I'm using because the C API does not have this feature (see #2).

When writing a set of matrices to the file, I pass in an array GrB_Matrix *A, where A [k] is the kth matrix. Then all I need is to pass in the number of matrices ... except (argh!) I also must know the matrix type so I can write it to the JSON file (otherwise, GrB_Matrix_deserialize won't know the type).

I need to be able to query a matrix for its type, returning a string ("double" for a GrB_FP64 matrix, etc), using my GxB_Matrix_type_name function. Otherwise, I have to pass in an array of GrB_Type, say Types [k], of size equal to the # of matrices. Good luck if the matrix type in Types [k] doesn't match the type of matrix A [k] ... things would break badly.

This is really really hard. Querying the type of a matrix needs to be added to the C API. We have a clumsy hack in LAGraph, where the LAGraph_Graph G has both G->A and G->A_type. The typical user writing a package that relies on GraphBLAS will be very frustrated at the lack of a matrix-type-query method in GraphBLAS.
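For readers trying to picture the file layout described in this comment, here is a minimal sketch of one per-matrix record: a JSON header in ASCII followed by the raw serialized bytes. The field names ("library", "version", "type", "blob_size"), the newline separator, and the `demo_*` function names are all assumptions made for this example, not the actual LAGraph format. The library name, version, and type name are passed in as plain arguments precisely because the C API offers no way to query them, and the sketch does no JSON string escaping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Format a one-matrix JSON header into buf (hypothetical field names).
   Returns the number of characters written, or -1 on error or truncation. */
int demo_format_header(char *buf, size_t buflen,
                       const char *library, unsigned ver, unsigned subver,
                       const char *type_name, size_t blob_size)
{
    assert(buf != NULL && library != NULL && type_name != NULL);
    int n = snprintf(buf, buflen,
        "{\"library\":\"%s\",\"version\":[%u,%u],"
        "\"type\":\"%s\",\"blob_size\":%zu}",
        library, ver, subver, type_name, blob_size);
    if (n < 0 || (size_t) n >= buflen) return -1;
    return n;
}

/* Write the header, a newline, then the binary blob to an open file.
   Returns 0 on success, -1 on any write failure. */
int demo_write_matrix(FILE *f, const char *header,
                      const void *blob, size_t blob_size)
{
    assert(f != NULL && header != NULL);
    if (fputs(header, f) == EOF) return -1;
    if (fputc('\n', f) == EOF) return -1;
    if (blob_size > 0 && fwrite(blob, 1, blob_size, f) != blob_size) return -1;
    return 0;
}
```

With a type-name query on the matrix itself, the `type_name` argument could be filled in from A [k] directly, and the separate Types [k] array (with its risk of mismatch) would not be needed.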

rayegun commented 1 year ago

A brief proposal, which I will expand on in another issue/strawman proposal:

This (along with some of the JIT issues) seems well within the functionality of the get and set functions found in GxB (and occasionally in GrB for Descriptors).

This might look like GrB_Type_set(GrB_NAME, "MyUDT"), with an equivalent get function. If we maintain GrB_<...>_set and get functions for each GraphBLAS type (Matrix, Type, BinaryOp) we can standardize metadata access for all implementations, even if those implementations don't define some metadata key.

This achieves a couple things:

  1. Removes a pretty significant GxB extension from SuiteSparse by making GxB_get and GxB_set official.
  2. Enables implementers to define their own metadata, as SuiteSparse does already.
  3. Enables implementers to gracefully decline certain introspection/reflection functionality by returning something like GrB_KEY_NOT_FOUND. From a user perspective I would rather handle an error like this than an entirely missing function (if for instance one implementation wants named BinaryOps and one does not).

Some of the proposal above, like getting a type from a name, would obviously not fit neatly here.
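To illustrate the shape of this proposal, here is a minimal sketch of a keyed get/set pair on a type object, where an unsupported key yields a "key not found" code instead of a missing function. All names (`demo_info`, `demo_key`, `demo_type_obj`, `demo_type_set`, `demo_type_get`) are hypothetical and chosen for this example only; the argument order and the 64-byte name limit are assumptions, not a proposed signature.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical return codes and metadata keys for this sketch. */
typedef enum { DEMO_SUCCESS, DEMO_KEY_NOT_FOUND, DEMO_NULL_POINTER } demo_info;
typedef enum { DEMO_NAME, DEMO_DEFINITION } demo_key;

/* Hypothetical stand-in for an opaque type object; stores only a name. */
typedef struct {
    char name[64];
} demo_type_obj;

/* Set a string-valued metadata field. This toy implementation supports
   only DEMO_NAME and declines every other key. Long names are truncated. */
demo_info demo_type_set(demo_type_obj *t, demo_key key, const char *value)
{
    if (t == NULL || value == NULL) return DEMO_NULL_POINTER;
    switch (key) {
    case DEMO_NAME:
        snprintf(t->name, sizeof t->name, "%s", value);
        assert(strlen(t->name) < sizeof t->name);
        return DEMO_SUCCESS;
    default:
        return DEMO_KEY_NOT_FOUND;
    }
}

/* Get a string-valued metadata field into a caller-owned buffer. */
demo_info demo_type_get(const demo_type_obj *t, demo_key key,
                        char *value, size_t len)
{
    if (t == NULL || value == NULL || len == 0) return DEMO_NULL_POINTER;
    switch (key) {
    case DEMO_NAME:
        snprintf(value, len, "%s", t->name);
        return DEMO_SUCCESS;
    default:
        return DEMO_KEY_NOT_FOUND;
    }
}
```

A caller can then branch on the return code: an implementation that does not keep a given piece of metadata returns the not-found code, and the application falls back to its own bookkeeping.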