macro_rules! c {
($i:expr, $j:expr) => (c.offset(rsc * $i as isize + csc * $j as isize));
}
// C ← α A B + β C
let mut c = [_mm256_setzero_ps(); MR];
let betav = _mm256_set1_ps(beta);
if beta != 0. {
// Read C
if csc == 1 {
loop_m!(i, c[i] = _mm256_loadu_ps(c![i, 0]));
// Handle rsc == 1 case with transpose?
} else {
loop_m!(i, c[i] = _mm256_set_ps(*c![i, 7], *c![i, 6], *c![i, 5], *c![i, 4], *c![i, 3], *c![i, 2], *c![i, 1], *c![i, 0]));
}
// Compute β C
loop_m!(i, c[i] = _mm256_mul_ps(c[i], betav));
}
What I assume happens (must happen): at the definition site of the c! macro, the variable c refers to the *mut T passed in as a function argument to the kernel. Later on though, c refers to the MR-sized array containing simd packed f32 8-vectors.
For a reader trying to get familiar with the code base, this shadowing of the variables is enormously confusing and should probably be changed.
What's happening in this chunk of code is enormously confusing: https://github.com/bluss/matrixmultiply/blob/master/src/sgemm_kernel.rs#L301-L318:
What I assume happens (must happen): at the definition site of the
c!
macro, the variablec
refers to the*mut T
passed in as a function argument to the kernel. Later on though, c refers to theMR
-sized array containing simd packed f32 8-vectors.For a reader trying to get familiar with the code base, this shadowing of the variables is enormously confusing and should probably be changed.