Open kevaundray opened 3 weeks ago
This PR is mainly being used to measure how much parallelism helps. If we want significant improvements beyond this, they will need to come from the MSM algorithm.
Results (baseline is single-threaded master):
computing cells_and_kzg_proofs - NUM_THREADS: Single
time: [180.78 ms 181.35 ms 182.08 ms]
change: [-10.891% -10.454% -10.017%] (p = 0.00 < 0.05)
Performance has improved.
computing cells_and_kzg_proofs - NUM_THREADS: Multi(4)
time: [57.467 ms 57.642 ms 57.840 ms]
change: [-72.102% -71.859% -71.645%] (p = 0.00 < 0.05)
Performance has improved.
computing cells_and_kzg_proofs - NUM_THREADS: Multi(8)
time: [40.380 ms 40.771 ms 41.192 ms]
change: [-81.162% -80.477% -79.991%] (p = 0.00 < 0.05)
Performance has improved.
computing cells_and_kzg_proofs - NUM_THREADS: Multi(16)
time: [43.413 ms 43.995 ms 44.684 ms]
change: [-78.833% -78.526% -78.173%] (p = 0.00 < 0.05)
Performance has improved.
computing cells_and_kzg_proofs - NUM_THREADS: Multi(32)
time: [49.036 ms 50.404 ms 52.169 ms]
change: [-76.070% -75.382% -74.463%] (p = 0.00 < 0.05)
Performance has improved.
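As a rough way to read these numbers, the sketch below computes the speedup and per-thread efficiency of each multi-threaded run relative to the single-threaded run on this branch (not relative to master, which is what Criterion's `change` line reports). The median times are copied from the output above; the helper names `speedup` and `efficiency` are made up for this illustration.

```rust
/// Median times in milliseconds, copied from the Criterion output above.
const SINGLE_MS: f64 = 181.35;
const MULTI_MS: [(u32, f64); 4] = [(4, 57.642), (8, 40.771), (16, 43.995), (32, 50.404)];

/// Speedup of a multi-threaded run over the single-threaded run.
fn speedup(single_ms: f64, multi_ms: f64) -> f64 {
    single_ms / multi_ms
}

/// Parallel efficiency: speedup divided by the number of threads used.
fn efficiency(single_ms: f64, multi_ms: f64, threads: u32) -> f64 {
    speedup(single_ms, multi_ms) / threads as f64
}

fn main() {
    for &(threads, ms) in MULTI_MS.iter() {
        println!(
            "{:>2} threads: {:.2}x speedup, {:.0}% efficiency",
            threads,
            speedup(SINGLE_MS, ms),
            100.0 * efficiency(SINGLE_MS, ms, threads)
        );
    }
}
```

On these medians the speedup is roughly 3.1x at 4 threads, peaks at about 4.4x at 8 threads, and then falls back to about 4.1x and 3.6x at 16 and 32 threads.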
Note that the single-threaded version has also improved, because the FFT algorithm was changed.
This PR parallelizes the BatchToeplitz operations and makes the FFT both parallel and iterative.
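For readers unfamiliar with the term, here is a minimal sketch of what an iterative (non-recursive) radix-2 FFT looks like. It is not the code in this PR: it is single-threaded, and it works over a small illustrative prime field (`P = 998244353` with primitive root `3`) rather than the scalar field the library actually uses. All names (`fft_in_place`, `dft_naive`, `pow_mod`) are hypothetical.

```rust
// Illustrative field only: P = 119 * 2^23 + 1, with primitive root G = 3.
const P: u64 = 998_244_353;
const G: u64 = 3;

/// Modular exponentiation by squaring: base^exp mod P.
fn pow_mod(mut base: u64, mut exp: u64) -> u64 {
    let mut acc = 1u64;
    base %= P;
    while exp > 0 {
        if exp & 1 == 1 {
            acc = acc * base % P;
        }
        base = base * base % P;
        exp >>= 1;
    }
    acc
}

/// In-place iterative radix-2 FFT over F_P (decimation in time).
fn fft_in_place(a: &mut [u64]) {
    let n = a.len();
    assert!(n.is_power_of_two() && n <= 1 << 23);
    if n == 1 {
        return;
    }
    for x in a.iter_mut() {
        *x %= P;
    }
    // Bit-reversal permutation replaces the recursive even/odd split.
    let bits = n.trailing_zeros();
    for i in 0..n {
        let j = i.reverse_bits() >> (usize::BITS - bits);
        if i < j {
            a.swap(i, j);
        }
    }
    // Butterfly stages on blocks of size 2, 4, ..., n.
    let mut len = 2;
    while len <= n {
        let w_len = pow_mod(G, (P - 1) / len as u64); // primitive len-th root of unity
        // Blocks within one stage are independent of each other.
        for block in a.chunks_mut(len) {
            let (lo, hi) = block.split_at_mut(len / 2);
            let mut w = 1u64;
            for (x, y) in lo.iter_mut().zip(hi.iter_mut()) {
                let t = *y * w % P;
                *y = (*x + P - t) % P;
                *x = (*x + t) % P;
                w = w * w_len % P;
            }
        }
        len <<= 1;
    }
}

/// O(n^2) reference DFT, used only to check the FFT.
fn dft_naive(a: &[u64]) -> Vec<u64> {
    let n = a.len() as u64;
    let w = pow_mod(G, (P - 1) / n);
    (0..n)
        .map(|k| {
            a.iter().enumerate().fold(0u64, |acc, (j, &x)| {
                (acc + x % P * pow_mod(w, (j as u64 * k) % n)) % P
            })
        })
        .collect()
}

fn main() {
    let input = vec![1u64, 2, 3, 4, 5, 6, 7, 8];
    let mut a = input.clone();
    fft_in_place(&mut a);
    assert_eq!(a, dft_naive(&input));
    println!("{:?}", a);
}
```

Because the blocks within each stage touch disjoint parts of the slice, the `chunks_mut` loop is a natural place to split work across threads (for example with rayon's `par_chunks_mut`). Whether this PR parallelizes at that level is not stated here.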