SLATE is a distributed, GPU-accelerated, dense linear algebra library targeting current and upcoming high-performance computing (HPC) systems. It is developed as part of the U.S. Department of Energy Exascale Computing Project (ECP).
After looking at the alternatives, I felt the best way to implement slate_Options was to make it a pointer to the C++ options object and provide accessor functions to read and write the individual options. The slate_Options_create and slate_Options_destroy functions mirror the create/destroy functions for slate_Matrix, slate_Pivots, etc.
Options are set with, e.g., slate_Options_set_Lookahead( slate_Options opts, int64_t value ) and read with, e.g., slate_Options_get_Lookahead( slate_Options opts, int64_t defval ). Normal usage would be:
// Execute on GPU devices with lookahead of 2.
slate_Options opts = slate_Options_create();
slate_Options_set_Target( opts, slate_Target_Devices );
slate_Options_set_Lookahead( opts, 2 );
slate_multiply_c32( alpha, A, B, beta, C, opts );
slate_Options_destroy( opts );
For convenience, passing NULL as the options is converted into an empty options object. This covers the cases where the defaults are good enough, as in most of our examples:
// Execute on Host with lookahead of 1.
slate_multiply_c32( alpha, A, B, beta, C, NULL );
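To illustrate the accessor pattern described above, here is a minimal, self-contained sketch in plain C. It is not SLATE's implementation: the `demo_` names and the struct layout are hypothetical stand-ins, and it assumes the getter returns `defval` when the option was never set or when the handle is NULL.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for the opaque handle; not SLATE's real type. */
typedef struct demo_Options_impl {
    bool    has_lookahead;   /* true once the option was explicitly set */
    int64_t lookahead;
} *demo_Options;

/* Allocate a handle with every option unset (calloc zero-fills). */
demo_Options demo_Options_create(void)
{
    return calloc(1, sizeof(struct demo_Options_impl));
}

/* Release a handle; free(NULL) is a no-op, so NULL is accepted. */
void demo_Options_destroy(demo_Options opts)
{
    free(opts);
}

/* Record an explicit value for the lookahead option. */
void demo_Options_set_Lookahead(demo_Options opts, int64_t value)
{
    assert(opts != NULL);    /* setting requires a real handle */
    opts->has_lookahead = true;
    opts->lookahead = value;
}

/* Return the stored value, or defval if unset or if opts is NULL. */
int64_t demo_Options_get_Lookahead(demo_Options opts, int64_t defval)
{
    if (opts == NULL || !opts->has_lookahead)
        return defval;
    return opts->lookahead;
}
```

Because the getter supplies the default, a routine can treat a NULL handle and an empty handle the same way, and the caller never manages an array length.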
Alternative approaches:
- An array of union types (i.e., what we had been doing).
  - We'd need to do something else for the Fortran API, but it'd be preferable to provide the same interface to both C and Fortran.
  - The manually managed length is tedious and a source of user error.
- A struct of all the types.
  - We'd still need a create function to set the defaults correctly.
  - We'd need to document all the default values. I'm also not certain how we'd represent a default for the bool options, since assigning any non-zero value to C's bool always stores 1.
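The bool concern can be reproduced in a few lines of C. This is a minimal sketch with a hypothetical struct and field name, not anything from SLATE: a sentinel such as -1, which could mark an integer option as "unset", cannot survive in a bool field.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical struct-of-options layout; the field name is illustrative. */
typedef struct {
    bool flag;
} demo_OptionsStruct;

/* Try to mark the field as "unset" with a sentinel and report what is stored. */
int demo_store_sentinel(void)
{
    demo_OptionsStruct o;
    o.flag = -1;             /* intended to mean "use the default" */
    return (int) o.flag;     /* conversion to bool collapses non-zero to 1 */
}

/* The sentinel is lost: the field is indistinguishable from an explicit true. */
void demo_check(void)
{
    assert(demo_store_sentinel() == 1);
}
```

A struct-based design would therefore need a separate "is set" marker or a wider integer type for each boolean option, which is part of why the accessor approach looked simpler.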
Other improvements:
- Added a version of get_option that takes the Option argument as a template parameter and returns the expected type, instead of relying on the caller to specify the correct type.
  - I needed that mapping to generate all the getters and setters for the C/Fortran API anyway, so I figured I'd get the most out of it.
  - I replaced the get_option calls in LU (all 3) and Cholesky to make sure it was working.
- Moved c_api/util.hh to src instead of include/slate. It only contains internal functions, so it shouldn't be part of our external API.
- Added BaseMatrix::num_devices() to the C/Fortran APIs. (I needed it so the gemm example only uses Target::Devices when GPUs are present.)
- The C++ BLAS example was hardcoded to double precision even though the functions are templated, so I fixed that.
- Added two C examples and one Fortran example to examples/c_api and examples/fortran, respectively. These examples were also added to the CI tests.