Closed teor2345 closed 5 months ago
It looks like the manual implementation is 60% faster, so we don't need to change our code here. But I'll still add benchmarks for both versions, in case other code changes make the library code faster.
Manual implementation:
Benchmarking Cyclotomic multiplication: polynomial/Random input: Collecting 50 samples in estimated 5
Cyclotomic multiplication: polynomial/Random input
time: [53.911 ms 54.057 ms 54.237 ms]
change: [-0.6539% -0.1373% +0.3285%] (p = 0.61 > 0.05)
No change in performance detected.
Found 4 outliers among 50 measurements (8.00%)
1 (2.00%) high mild
3 (6.00%) high severe
divide_with_q_and_r()
call:
Benchmarking Cyclotomic multiplication: polynomial/Random input: Collecting 50 samples in estimated 8
Cyclotomic multiplication: polynomial/Random input
time: [88.762 ms 88.957 ms 89.216 ms]
change: [+63.894% +64.562% +65.174%] (p = 0.00 < 0.05)
Performance has regressed.
Found 4 outliers among 50 measurements (8.00%)
2 (4.00%) high mild
2 (4.00%) high severe
In #9 we want to optimise polynomial multiplication.
We currently have a manual modulus implementation, but the existing library code for divide_with_q_and_r() might be faster.
Tasks: