Summary

This PR experiments with accelerating Arkworks MSM on the CPU using precomputation techniques. The result shows that while window-wise MSM with precomputation provides a slight speedup, the increased I/O and serialize/deserialize overhead are not worth it.

Check on this report for more detail: https://hackmd.io/v8WPAG8-RsCANrbv_seSXw

Status

Since the goal is to speed up MSM computation, we tried both GPU and CPU ways to accelerate it. The GPU way is recorded in this issue https://github.com/zkmopro/mopro/issues/153.

This PR

Implement the generation for precomputed point and related serde methods. Adapted the Arkworks MSM algorithm to use the precomputed points.

benchmark result (on MacBook with M3 chip)

similar to Arkworks MSM, sometimes are slightly faster but not significant

msm time (ms) / instance size	2^10	2^16	2^18	2^20	2^22	2^24
Arkworks msm	2.68	67.96	245.38	912.32	3641.81	14254.36
Window-wise MSM with precomputation (on CPU)	2.13	65.83	234.32	915.91	x	x

The relationship among acceleration speed, size of precomputation points and time for precomputation is as follow:

2^10: precompute_factor=20 (i.e. precomputed point size increase 20 times)
2^16: precompute_factor=20
2^18: precompute_factor=19
2^20: precompute_factor=17 (Size of precomputed points is around 570 MB)

zkmopro / mopro

Feat/window-wise msm with precomputation acceleration on CPU #187

Summary

Status

This PR

benchmark result (on MacBook with M3 chip)