pranav-prakash closed 3 years ago
Even in the old version, the compiler did loop unswitching for us, so we only computed the strides once inside the j-loop (and with branch prediction the effect would be negligible anyway). Nonetheless, this version is cleaner since it removes the branching entirely.
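For readers unfamiliar with the term, here is a minimal sketch of the two shapes being compared. It is not the code from this PR: the kernel, the function names `add_branchy` and `add_strided`, and the `b_is_scalar` flag are all hypothetical, chosen only to illustrate a loop-invariant branch versus a stride computed up front.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical kernel: out[i][j] = a[i][j] + b[j], where b is either a full
// row of length `cols` or a single scalar that is broadcast across the row.

// "Old" shape: the condition does not change inside the loops, so an
// optimizing compiler may unswitch it (hoist the `if` out and emit one
// loop per case). Whether it does depends on the compiler and flags.
void add_branchy(const std::vector<float>& a, const std::vector<float>& b,
                 bool b_is_scalar, std::size_t rows, std::size_t cols,
                 std::vector<float>& out) {
  assert(a.size() == rows * cols && out.size() == rows * cols);
  assert(b.size() >= (b_is_scalar ? 1 : cols));
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      if (b_is_scalar) {
        out[i * cols + j] = a[i * cols + j] + b[0];
      } else {
        out[i * cols + j] = a[i * cols + j] + b[j];
      }
    }
  }
}

// "New" shape: the case distinction is folded into a stride computed once,
// before the loops. A stride of 0 keeps re-reading b[0]; a stride of 1
// walks along b. The loop body itself contains no branch.
void add_strided(const std::vector<float>& a, const std::vector<float>& b,
                 bool b_is_scalar, std::size_t rows, std::size_t cols,
                 std::vector<float>& out) {
  assert(a.size() == rows * cols && out.size() == rows * cols);
  assert(b.size() >= (b_is_scalar ? 1 : cols));
  const std::size_t b_stride = b_is_scalar ? 0 : 1;
  for (std::size_t i = 0; i < rows; ++i) {
    for (std::size_t j = 0; j < cols; ++j) {
      out[i * cols + j] = a[i * cols + j] + b[j * b_stride];
    }
  }
}
```

Both functions produce the same output. The second one simply does not rely on the optimizer to notice that the condition is loop-invariant.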