zkmopro / mopro

Making client-side proving on mobile simple.
https://zkmopro.org
Apache License 2.0

Optimize Metal MSM Implementation for Enhanced GPU Utilization #153

Closed: moven0831 closed this issue 5 months ago

moven0831 commented 5 months ago

Problem

The current Metal MSM implementation uses the GPU for only part of the computation, which leads to suboptimal performance. To make full use of the GPU, we need to optimize the MSM process by removing the overheads identified below and implementing the enhancements listed under Optimization Goals.

Details

Current Implementation

The Metal MSM implementation currently derives from Lambdaworks' Metal backend and Arkworks' MSM implementation. Metal is used only for the MSM bucket accumulation phase on the GPU, while the other phases run on the CPU. This partial GPU utilization limits the potential performance gains.
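To make the phases concrete, here is a minimal sketch of a bucket-method (Pippenger-style) MSM. It is not the mopro or Arkworks code: as a loudly stated assumption, it replaces elliptic curve points with a toy group (integers modulo `P` under addition) and uses 64-bit scalars, so that it runs with the standard library alone. The names `msm_bucket`, `msm_naive`, `add`, and `P` are hypothetical. Phase 1 is the kind of work the description above assigns to the GPU; Phases 2 and 3 would be the parts still running on the CPU.

```rust
// Toy stand-in for a curve group (assumption): integers modulo P under
// addition, so "scalar multiplication" k * g is simply (k * g) mod P.
const P: u64 = 1_000_000_007;

fn add(a: u64, b: u64) -> u64 {
    (a + b) % P
}

/// Reference result: sum of k_i * g_i computed directly.
fn msm_naive(points: &[u64], scalars: &[u64]) -> u64 {
    points.iter().zip(scalars).fold(0, |acc, (&g, &k)| {
        add(acc, ((k as u128 * g as u128) % P as u128) as u64)
    })
}

/// Bucket-method MSM over 64-bit scalars with c-bit windows.
fn msm_bucket(points: &[u64], scalars: &[u64], c: u32) -> u64 {
    assert!((1..=16).contains(&c), "window size out of range");
    let num_windows = (64 + c - 1) / c;
    let mask = (1u64 << c) - 1;
    let mut window_sums = Vec::with_capacity(num_windows as usize);

    for w in 0..num_windows {
        // Phase 1: bucket accumulation. Each point is added to the bucket
        // selected by its scalar's c-bit digit in window w (digit 0 is skipped).
        let mut buckets = vec![0u64; (1usize << c) - 1];
        for (&g, &k) in points.iter().zip(scalars) {
            let digit = (k >> (w * c)) & mask;
            if digit != 0 {
                let idx = (digit - 1) as usize;
                buckets[idx] = add(buckets[idx], g % P);
            }
        }

        // Phase 2: bucket reduction. A running sum from the highest bucket
        // down yields sum_j j * bucket_j using additions only.
        let mut running = 0;
        let mut sum = 0;
        for &b in buckets.iter().rev() {
            running = add(running, b);
            sum = add(sum, running);
        }
        window_sums.push(sum);
    }

    // Phase 3: window combination. Horner's rule with c doublings per window
    // computes sum_w 2^(c*w) * window_sums[w].
    let mut acc = 0;
    for &ws in window_sums.iter().rev() {
        for _ in 0..c {
            acc = add(acc, acc);
        }
        acc = add(acc, ws);
    }
    acc
}
```

For example, `msm_bucket(&[1, 2, 3], &[4, 5, 6], 3)` agrees with `msm_naive` on the same inputs. In a real backend the group operation is elliptic curve point addition, which is why moving only Phase 1 to the GPU still leaves the bucket conversion and the remaining phases on the CPU side.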

Identified Overheads

  1. Parsing points and scalars for GPU computation.
  2. Converting GPU-calculated buckets back to Arkworks-compatible types.
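As an illustration of where this kind of overhead comes from, the sketch below repacks 256-bit values between two limb layouts. It assumes, without confirmation from the issue, that the CPU side holds each value as four little-endian 64-bit limbs and that the GPU buffer expects eight little-endian 32-bit limbs. The function names are hypothetical and the actual Metal buffer layout may differ.

```rust
/// Split four little-endian 64-bit limbs into eight little-endian 32-bit limbs.
/// (Assumed layouts; not taken from the mopro code.)
fn to_u32_limbs(limbs: [u64; 4]) -> [u32; 8] {
    let mut out = [0u32; 8];
    for (i, limb) in limbs.iter().enumerate() {
        out[2 * i] = *limb as u32; // low 32 bits
        out[2 * i + 1] = (*limb >> 32) as u32; // high 32 bits
    }
    out
}

/// Inverse of `to_u32_limbs`: rebuild the four 64-bit limbs.
fn from_u32_limbs(limbs: [u32; 8]) -> [u64; 4] {
    let mut out = [0u64; 4];
    for i in 0..4 {
        out[i] = limbs[2 * i] as u64 | ((limbs[2 * i + 1] as u64) << 32);
    }
    out
}

/// Flatten many scalars into one contiguous buffer, as would be needed
/// before copying them to the GPU. This is O(n) work on every MSM call.
fn pack_scalars(scalars: &[[u64; 4]]) -> Vec<u32> {
    scalars.iter().flat_map(|s| to_u32_limbs(*s)).collect()
}
```

Both directions touch every limb of every element, so the cost grows with the number of points, scalars, and buckets that cross the CPU/GPU boundary.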

Optimization Goals

  1. Full GPU Implementation: Implement the entire MSM process on the GPU so that only the final MSM result has to be converted back to Arkworks-compatible types, reducing data conversion overhead.
  2. Initialization Speed: Accelerate the initialization of points and scalars for GPU computation.
  3. Advanced Techniques: Apply optimization techniques such as precomputation of points.
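One possible form of point precomputation, sketched here under the same toy assumptions (integers modulo `P` stand in for curve points, scalars are 64 bits, all names are hypothetical), is to store `2^(c*w) * g_i` for every window `w` ahead of time. All windows can then share a single bucket set, and the per-window doublings disappear from the MSM itself. The issue does not say which precomputation scheme is intended, so treat this as one option rather than the planned design.

```rust
// Toy stand-in for a curve group (assumption): integers modulo P under addition.
const P: u64 = 1_000_000_007;

fn add(a: u64, b: u64) -> u64 {
    (a + b) % P
}

/// Reference result: sum of k_i * g_i computed directly.
fn msm_naive(points: &[u64], scalars: &[u64]) -> u64 {
    points.iter().zip(scalars).fold(0, |acc, (&g, &k)| {
        add(acc, ((k as u128 * g as u128) % P as u128) as u64)
    })
}

/// One-time setup for fixed points: table[w][i] = 2^(c*w) * g_i.
fn precompute(points: &[u64], c: u32) -> Vec<Vec<u64>> {
    assert!((1..=16).contains(&c), "window size out of range");
    let num_windows = (64 + c - 1) / c;
    let mut table = Vec::with_capacity(num_windows as usize);
    let mut current: Vec<u64> = points.iter().map(|&g| g % P).collect();
    for _ in 0..num_windows {
        table.push(current.clone());
        // Shift every point up by one window: c doublings.
        for g in current.iter_mut() {
            for _ in 0..c {
                *g = add(*g, *g);
            }
        }
    }
    table
}

/// MSM using the table. `c` must match the value passed to `precompute`.
fn msm_precomputed(table: &[Vec<u64>], scalars: &[u64], c: u32) -> u64 {
    let mask = (1u64 << c) - 1;
    // A single bucket set is shared by all windows.
    let mut buckets = vec![0u64; (1usize << c) - 1];
    for (w, row) in table.iter().enumerate() {
        for (&g, &k) in row.iter().zip(scalars) {
            let digit = (k >> (w as u32 * c)) & mask;
            if digit != 0 {
                let idx = (digit - 1) as usize;
                buckets[idx] = add(buckets[idx], g);
            }
        }
    }
    // One running-sum reduction gives sum_j j * bucket_j; no doublings remain.
    let mut running = 0;
    let mut sum = 0;
    for &b in buckets.iter().rev() {
        running = add(running, b);
        sum = add(sum, running);
    }
    sum
}
```

The trade-off is memory: the table holds one copy of every point per window, which pays off only when the same points are reused across many MSM calls.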

By addressing these areas, we aim to significantly enhance MSM computation speed using Metal.

Acceptance Criteria

  1. The entire MSM process is implemented on the GPU, minimizing data conversion overhead.
  2. Initialization of points and scalars for GPU computation is significantly faster.
  3. The final implementation is compatible with Arkworks and passes all relevant tests.

vivianjeng commented 5 months ago

Moved to https://github.com/zkmopro/gpu-acceleration/issues/1