Closed viv-eth closed 4 months ago
The new `implementation_t` lets you select which kernel variant to run. The supported flavors are:

- `BASELINE`: Assembly-based baseline kernel (FP32, FP16, FP8)
- `NAIVE`: Simple for-loops for computing the GEMM (FP64, FP32, FP8)
- `NAIVE_UNROLLED`: For-loops with loop unrolling (FP32)
- `OPT`: Optimized kernels leveraging SSRs and FREP (FP64, FP32, FP16)
- `OPT_EX`: Optimized low-precision kernels that expand to the next higher precision (FP16, FP8)
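
To make the variant/precision pairing above easier to check at a glance, here is a minimal C sketch of how such a selector could be modeled. Only the five variant names and their precision lists come from this description. The enum layout, the `IMPL_COUNT` sentinel, the `precision_t` flags, the `supported_precisions` table, and the `impl_supports` helper are hypothetical names introduced for illustration and may not match the actual definitions in the code base.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical mirror of the selector described above; the real
 * definition may order or number the variants differently. */
typedef enum {
    BASELINE,
    NAIVE,
    NAIVE_UNROLLED,
    OPT,
    OPT_EX,
    IMPL_COUNT /* sentinel for table sizing, not a real variant */
} implementation_t;

/* Hypothetical precision flags, one bit per floating-point format. */
typedef enum {
    PREC_FP64 = 1 << 0,
    PREC_FP32 = 1 << 1,
    PREC_FP16 = 1 << 2,
    PREC_FP8  = 1 << 3
} precision_t;

/* Precisions each variant supports, copied from the list above. */
static const unsigned supported_precisions[IMPL_COUNT] = {
    [BASELINE]       = PREC_FP32 | PREC_FP16 | PREC_FP8,
    [NAIVE]          = PREC_FP64 | PREC_FP32 | PREC_FP8,
    [NAIVE_UNROLLED] = PREC_FP32,
    [OPT]            = PREC_FP64 | PREC_FP32 | PREC_FP16,
    [OPT_EX]         = PREC_FP16 | PREC_FP8,
};

/* Returns true if the given variant is listed for the given precision. */
static bool impl_supports(implementation_t impl, precision_t prec) {
    assert((unsigned)impl < IMPL_COUNT);
    return (supported_precisions[impl] & (unsigned)prec) != 0;
}
```

A caller could guard a kernel launch with `impl_supports(OPT_EX, PREC_FP8)` and fall back to another variant when the check fails. Keeping the support matrix in one table means the list in this description and the code stay easy to compare.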