The HPC toolbox: fused matrix multiplication, convolution, data-parallel strided tensor primitives, OpenMP facilities, SIMD, JIT Assembler, CPU detection, state-of-the-art vectorized BLAS for floats and integers
AVX512 GEMM kernel #14
mratsim closed 5 years ago
Initial support of AVX512