This PR updates and extends half-precision support in portBLAS. It includes the following changes:
Half support is enabled with the CMake option BLAS_ENABLE_HALF and applies only to operators that are meant to support half according to the oneMKL spec (so far in this PR: axpy, scal and gemm).
Unit tests and benchmarks are extended to support mixed-precision comparison, since reference BLAS libs only support float/double.
Extended unit tests for axpy, scal and gemm (+gemm_batched) using half.
Extended portblas, cublas & rocblas benchmarks for gemm (+gemm_batched).
Separated the half gemm configurations from the float/double configurations for each TUNING_TARGET.
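For reviewers unfamiliar with the mixed-precision comparison mentioned above, here is a minimal sketch of the idea: compute the reference in float, then compare the half result with a tolerance scaled to the fp16 epsilon. This is not the code used in this PR. The helpers `round_to_half`, `axpy_half_emulated` and `almost_equal_half` are hypothetical names, fp16 is only emulated by rounding to an 11-bit significand (overflow and subnormals are ignored), and the 4-epsilon tolerance is an arbitrary assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: round a float to the nearest value with an 11-bit
// significand, approximating fp16 storage (overflow/subnormals are ignored).
inline float round_to_half(float x) {
  if (x == 0.0f || !std::isfinite(x)) return x;
  int exp = 0;
  float mant = std::frexp(x, &exp);  // x = mant * 2^exp, |mant| in [0.5, 1)
  mant = std::nearbyint(mant * 2048.0f) / 2048.0f;  // keep 11 significant bits
  return std::ldexp(mant, exp);
}

// Reference axpy computed in float: y = alpha * x + y.
inline void axpy_ref(float alpha, const std::vector<float>& x,
                     std::vector<float>& y) {
  assert(x.size() == y.size());
  for (std::size_t i = 0; i < x.size(); ++i) y[i] = alpha * x[i] + y[i];
}

// Emulated half axpy: operands and the result are rounded to half precision.
inline void axpy_half_emulated(float alpha, const std::vector<float>& x,
                               std::vector<float>& y) {
  assert(x.size() == y.size());
  const float a = round_to_half(alpha);
  for (std::size_t i = 0; i < x.size(); ++i) {
    y[i] = round_to_half(a * round_to_half(x[i]) + round_to_half(y[i]));
  }
}

// Compare a low-precision result against a float reference, with a
// tolerance expressed in multiples of the fp16 machine epsilon (2^-10).
inline bool almost_equal_half(const std::vector<float>& ref,
                              const std::vector<float>& out,
                              float n_eps = 4.0f) {
  const float half_eps = 9.765625e-4f;  // 2^-10
  if (ref.size() != out.size()) return false;
  for (std::size_t i = 0; i < ref.size(); ++i) {
    const float tol = n_eps * half_eps * std::max(1.0f, std::fabs(ref[i]));
    if (std::fabs(ref[i] - out[i]) > tol) return false;
  }
  return true;
}
```

A test would run `axpy_ref` and the half implementation on the same inputs and pass the two outputs to `almost_equal_half`. An exact or float-epsilon comparison would reject correct half results.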
Other notes:
Half-precision support is disabled when targeting DEFAULT_CPU due to lack of fp16 support.
Some legacy gemm configurations for Intel GPU targets with sycl::half have been removed (they were not based on tuning but were a temporary reduction of the number of generated kernels).