bluss / matrixmultiply

General matrix multiplication of f32 and f64 matrices in Rust. Supports matrices with general strides.
https://docs.rs/matrixmultiply/
Apache License 2.0
209 stars 25 forks source link

Don't shadow c in sgemm_kernel::kernel_x86_avx #32

Closed SuperFluffy closed 5 years ago

SuperFluffy commented 5 years ago

What's happening in this chunk of code is enormously confusing: https://github.com/bluss/matrixmultiply/blob/master/src/sgemm_kernel.rs#L301-L318:

macro_rules! c {
    ($i:expr, $j:expr) => (c.offset(rsc * $i as isize + csc * $j as isize));
}
// C ← α A B + β C
let mut c = [_mm256_setzero_ps(); MR];
let betav = _mm256_set1_ps(beta);
if beta != 0. {
    // Read C
    if csc == 1 {
        loop_m!(i, c[i] = _mm256_loadu_ps(c![i, 0]));
    // Handle rsc == 1 case with transpose?
    } else {
        loop_m!(i, c[i] = _mm256_set_ps(*c![i, 7], *c![i, 6], *c![i, 5], *c![i, 4], *c![i, 3], *c![i, 2], *c![i, 1], *c![i, 0]));
    }
// Compute β C
    loop_m!(i, c[i] = _mm256_mul_ps(c[i], betav));
}

What I assume happens (must happen): at the definition site of the c! macro, the variable c refers to the *mut T passed in as a function argument to the kernel. Later on though, c refers to the MR-sized array containing simd packed f32 8-vectors.

For a reader trying to get familiar with the code base, this shadowing of the variables is enormously confusing and should probably be changed.

bluss commented 5 years ago

Hehe, ok since it was written in different passes, never noticed. Array c should be named cv.