GraphBLAS / graphblas-api-c


named and defined operators and types #32

Closed DrTimothyAldenDavis closed 9 months ago

DrTimothyAldenDavis commented 3 years ago

GrB_Type_new (&type, sizeof (ctype)) is too limited. I need to add 2 strings to this: a name, and a definition.

For example:

typedef struct { float stuff [4] ; int wonkiness ; } mywonkytype ;
GrB_Type_new (&MyWType, sizeof (mywonkytype), "mywonkytype", "typedef struct { float stuff [4] ; int wonkiness ; } mywonkytype ;") ;

This way, I have the name of the type. This can be used for inspecting the type of a matrix:

char typename [GxB_MAX_NAME_LEN] ;
GxB_Matrix_type_name (typename, A) ;

would fill the array typename with the null-terminated string "mywonkytype". We can also have a function that queries a serialized blob for the name of the type of the matrix it contains. Then, a process doing a deserialize could do the following (the function name GxB_blob_type_name is a placeholder):

char typename [GxB_MAX_NAME_LEN] ;
GxB_blob_type_name (typename, blob, blobsize) ;
if (strcmp (typename, "mywonkytype") == 0)
{
    // I recognize this type, this is MyWType:
    GrB_Matrix_deserialize (&A, blob, blobsize, MyWType) ;
}
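
When the receiving process knows more than one type, the strcmp test above generalizes to a small lookup table keyed by type name. The following is only a sketch in plain C with no GraphBLAS calls; the names known_type, known_types, and find_known_type are hypothetical and exist only for illustration. In a real program each entry would also hold the GrB_Type handle to pass to the deserialize method.

```c
#include <stddef.h>
#include <string.h>

typedef struct { float stuff [4] ; int wonkiness ; } mywonkytype ;

// Hypothetical table entry: a type name and the size of the matching C type.
typedef struct { const char *name ; size_t size ; } known_type ;

// Types this process knows how to handle (assumed list, for illustration).
static const known_type known_types [] =
{
    { "mywonkytype", sizeof (mywonkytype) },
    { "double",      sizeof (double) },
} ;

// Return the entry whose name matches, or NULL if the name is not recognized.
const known_type *find_known_type (const char *name)
{
    size_t n = sizeof (known_types) / sizeof (known_types [0]) ;
    for (size_t k = 0 ; k < n ; k++)
    {
        if (strcmp (name, known_types [k].name) == 0)
        {
            return (&known_types [k]) ;
        }
    }
    return (NULL) ;
}
```

A NULL result means the blob holds a type this process cannot interpret, so it should not attempt the deserialize.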

With the 2nd string, in this case "typedef struct { float stuff [4] ; int wonkiness ; } mywonkytype ;", I can create CPU and GPU kernels at runtime, with a JIT. Without these 2 strings, I cannot write kernels on the GPU to do work with any user-defined types. A GPU cannot call a user-defined function pointer, for example.

I also would like to propose adding these 2 strings to all methods that create operators: say we have a user-defined operator that computes a boolean result, z = (x > 3) where x is int32 (yes, this can be done as a built-in but this is just a simple example):

void myunopfunc (void *z, const void *x) { (*((bool *) z)) = (*((const int32_t *) x) > 3) ; }
...
GrB_UnaryOp op ;
GxB_UnaryOp_new (&op, myunopfunc, GrB_BOOL, GrB_INT32, "myunopfunc",
"void myunopfunc (void *z, const void *x) { (*((bool *) z)) = (*((const int32_t *) x) > 3) ; }") ;

Note the 2nd string is the entire copy of the user-function, just as a string. This 2nd string would be optional; it would only be required if the user wants to be fast and if I do a JIT to make it fast.
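
Keeping the function and its string copy identical by hand is error-prone. One possible way to avoid the duplication, sketched here as an assumption and not as anything the proposal requires, is a preprocessor macro that emits the code and also stringifies it. The macro DEFINE_WITH_SOURCE and the variable myunopfunc_defn are hypothetical names, not part of GraphBLAS.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

// Hypothetical helper macro: emits the code exactly as written, and also
// stores its text in a string named <name>_defn via the # operator.
#define DEFINE_WITH_SOURCE(name, ...)                   \
    __VA_ARGS__                                         \
    static const char name##_defn [] = #__VA_ARGS__ ;

DEFINE_WITH_SOURCE (myunopfunc,
void myunopfunc (void *z, const void *x)
{
    // z = (x > 3), with x an int32_t and z a bool
    (*((bool *) z)) = (*((const int32_t *) x) > 3) ;
})
```

Here myunopfunc is an ordinary function and myunopfunc_defn holds its source text, so both could be handed to an operator constructor that takes a name and a definition. The preprocessor collapses runs of whitespace in the string to single spaces, which does not matter to a compiler that later reads it.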

My existing GxB_Matrix_type (&type, A) method is not a good idea since the type returned is only useful for the process that created the type. Instead, we should be able to query a matrix for its type, by asking for a string that contains the C type. Built-in types would return "bool", "int8_t", ... "float", "double" etc, to match the return value for user-defined types (the C name of the typedef, mywonkytype).

Named types are essential for querying the type of a matrix or serialized blob. The definition of the type, as a string, will be essential to get user-defined types to work just as fast as built-in types, and this also requires named and defined user-defined operators, with these 2 strings.

I don't want to write a compiler, so the content of the 2nd string should only be parsed by a compiler, not by GraphBLAS. But GraphBLAS can then use that string to build a file, compile it, link it in, and call it. The resulting performance would match built-in types and operators.
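
The "build a file" step can be sketched without any parsing: the two definition strings are treated as opaque text and pasted into a C source buffer, type definition first so that the operator can refer to it. This is a minimal illustration only. The function name build_kernel_source and the choice of included headers are assumptions, and a real JIT would go on to write the buffer to a file, invoke a compiler, and load the result.

```c
#include <stdio.h>
#include <string.h>

// Hypothetical helper: assemble a C source file from a type definition and an
// operator definition, without inspecting either string. Returns what snprintf
// returns: the length needed, or a negative value on error.
int build_kernel_source
(
    char *buf,              // output buffer
    size_t buflen,          // size of the output buffer
    const char *type_defn,  // e.g. "typedef struct { ... } mywonkytype ;"
    const char *op_defn     // e.g. "void myunopfunc (...) { ... }"
)
{
    return (snprintf (buf, buflen,
        "#include <stdbool.h>\n"
        "#include <stdint.h>\n"
        "%s\n"
        "%s\n",
        type_defn, op_defn)) ;
}
```

If the return value is at least buflen, the output was truncated and the caller should retry with a larger buffer.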

DrTimothyAldenDavis commented 2 years ago

This method is required for #66.

DrTimothyAldenDavis commented 1 year ago

I am now successfully using these pairs of strings (the name of a type and its typedef, or the name of an operator and its C/C++ function), for our CUDA kernels in the master branch of SuiteSparse/GraphBLAS. The strings are injected into a header file and provided to the JIT compiler at run time.

These strings could also be added after the type or operator is constructed, with GrB_get/set methods.

rayegun commented 1 year ago

Type names will be added by get/set in 2.1, required in the constructor for 3.0.

The typedef itself and the function name/function def will be impl specific GrB_get/set fields!