Open z24tao opened 2 years ago
gemm: C = alpha * A @ B + beta * C. The intermediate result of alpha * A @ B is currently passed to the next kernel as individual scalars instead of being vectorized.
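For reference, here is a minimal pure-Python sketch of the two-stage computation described above. It is not the project's kernel code: the function names `scaled_matmul`, `add_scaled`, and `gemm` are hypothetical, and plain lists of rows stand in for whatever vector type the real kernels would use. It only illustrates the shape of the data flow, where the first stage produces alpha * A @ B and the second stage consumes it a whole row at a time rather than element by element.

```python
def scaled_matmul(alpha, A, B):
    """Stage 1 (hypothetical): return alpha * (A @ B) as a list of rows."""
    cols_of_B = list(zip(*B))  # transpose B so each column is easy to iterate
    return [
        [alpha * sum(a * b for a, b in zip(row, col)) for col in cols_of_B]
        for row in A
    ]


def add_scaled(T, beta, C):
    """Stage 2 (hypothetical): return T + beta * C, taking T one full row at a time."""
    return [
        [t + beta * c for t, c in zip(t_row, c_row)]
        for t_row, c_row in zip(T, C)
    ]


def gemm(alpha, A, B, beta, C):
    """C_out = alpha * A @ B + beta * C, with the intermediate passed between stages as rows."""
    intermediate = scaled_matmul(alpha, A, B)
    return add_scaled(intermediate, beta, C)


A = [[1.0, 2.0], [3.0, 4.0]]
B = [[5.0, 6.0], [7.0, 8.0]]
C = [[1.0, 1.0], [1.0, 1.0]]
result = gemm(2.0, A, B, 0.5, C)
print(result)  # [[38.5, 44.5], [86.5, 100.5]]
```

In this sketch the intermediate crosses the stage boundary as complete rows, which is the analogue of handing the next kernel a vector instead of one scalar per element.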