flintlib / flint

FLINT (Fast Library for Number Theory)
http://www.flintlib.org
GNU Lesser General Public License v3.0
428 stars 242 forks source link

Strassen multiplication does more multiplications than necessary #2063

Open fredrik-johansson opened 2 weeks ago

fredrik-johansson commented 2 weeks ago

It looks like in all our implementations of Strassen multiplication, we compute the bottom right entry twice when the number of rows of A and columns of B are both odd. This doesn't really matter for a 101x101 multiply, but it does matter for a 3x3 multiply with big entries where we currently actually do 29 entry multiplications instead of 26 (so here our Strassen is actually worse than classical which does 27).

fredrik-johansson commented 2 weeks ago

So basically instead of

    if (a > 2*anr)
    {
        fmpz_mat_t Ar, Cr;
        fmpz_mat_window_init(Ar, A, 2*anr, 0, a, b);
        fmpz_mat_window_init(Cr, C, 2*anr, 0, a, c);
        fmpz_mat_mul(Cr, Ar, B);
        fmpz_mat_window_clear(Ar);
        fmpz_mat_window_clear(Cr);
    }

we should do

    if (a > 2*anr)
    {
        fmpz_mat_t Ar, Br, Cr;
        fmpz_mat_window_init(Ar, A, 2*anr, 0, a, b);
        fmpz_mat_window_init(Br, B, 0, 0, b, 2*bnc);
        fmpz_mat_window_init(Cr, C, 2*anr, 0, a, 2*bnc);
        fmpz_mat_mul(Cr, Ar, Br);
        fmpz_mat_window_clear(Ar);
        fmpz_mat_window_clear(Br);
        fmpz_mat_window_clear(Cr);
    }

(maybe with a separate case for when Br = B, to avoid the extra window matrix allocation)