Closed copybara-service[bot] closed 3 months ago
Simplify matmul: only 2 overloads
Also add StoreHorizontalSumsMaybeAdd wrapper function, move MatMulSlowBatch into test.
1.02-1.06x speedup.
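
For readers unfamiliar with the "maybe add" pattern, here is a rough scalar sketch of what such a wrapper could look like. It is not the code from this change: the name `StoreSumsMaybeAdd`, the `kAdd` template parameter, and the `std::array` accumulators are assumptions for illustration only. The real function presumably operates on SIMD vectors.

```cpp
#include <array>
#include <cstddef>

// Hypothetical scalar stand-in for a "horizontal sums, maybe add" wrapper.
// acc[c] holds kLanes partial sums for output column c (a stand-in for one
// SIMD accumulator). When kAdd is true, add[c] is added to the reduced sum.
template <bool kAdd, size_t kLanes, size_t kCols>
void StoreSumsMaybeAdd(
    const std::array<std::array<float, kLanes>, kCols>& acc,
    const float* add, float* out) {
  for (size_t c = 0; c < kCols; ++c) {
    float sum = 0.0f;
    for (float v : acc[c]) sum += v;  // horizontal sum over the lanes
    if constexpr (kAdd) {
      sum += add[c];  // compile-time branch: no cost when kAdd is false
    }
    out[c] = sum;
  }
}
```

Selecting the add path with a compile-time flag lets one wrapper serve both the "with add" and "without add" cases, which is one plausible way to reduce the number of overloads. This sketch requires C++17 for `if constexpr`.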