taikoxyz / zkevm-circuits

DEPRECATED in favor of https://github.com/taikoxyz/raiko! Taiko's fork of the PSE's ZK-EVM

GPU prover #16

Open Brechtpd opened 2 years ago

Brechtpd commented 2 years ago

Look into using the GPU to speed up certain prover work:

  • FFT
  • MSM
  • Custom gates?

Libraries:

mratsim commented 1 year ago

Others:

See also my quick analysis at: https://github.com/mratsim/constantine/issues/92

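There are 2 additional backends that might be interesting:

  • AMD GPUs, in particular because AMD offers significantly more memory than Nvidia (see AMD teasing: https://community.amd.com/t5/gaming/building-an-enthusiast-pc/ba-p/599407), but they aren't available in cloud machines.
  • Apple Metal: thanks to unified memory, Mac Studios and Mac Pros can access up to 192 GB of memory, enough to fit the super-circuit. However, Metal assembly is closed source. I tried to look into reverse-engineering efforts, either from Apple LLVM or Asahi Linux, to at least find add-with-carry, but I'm not hopeful.

Intel integrated GPUs also have unified memory, but they are not powerful enough. If we want to use those, we need to wait for an LLVM version whose SPIR-V support is no longer experimental; otherwise LLVM needs to be built from source along with a couple of other LLVM+SPIR-V translators.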

hugo-blue commented 1 year ago

> Look into using the GPU to speed up certain prover work:
>
>   • FFT
>   • MSM
>   • Custom gates?
>
> Libraries:

The evaluation parts of the lookup and permutation arguments also deserve optimization.
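For a sense of why the FFT item in the list above is a natural GPU target, here is a minimal sketch of a radix-2 number-theoretic transform (the finite-field FFT). The modulus 17 and root 2 are toy values chosen so the example runs by hand; they are assumptions for illustration, not the prover's actual field. The point is structural: within each stage, every butterfly reads and writes its own pair of slots, so the inner loops could be spread across GPU threads.

```python
# Toy radix-2 NTT over a small prime field.
# P = 17 and OMEGA = 2 are illustrative: 2 has multiplicative order 8
# modulo 17, so it is a primitive 8th root of unity (input length must be 8).
P = 17
OMEGA = 2


def bit_reverse_permute(values):
    """Reorder values so that index i moves to the bit-reversal of i."""
    n = len(values)
    bits = n.bit_length() - 1
    out = [0] * n
    for i, v in enumerate(values):
        rev = int(format(i, f"0{bits}b")[::-1], 2)
        out[rev] = v
    return out


def ntt(coeffs, omega=OMEGA, p=P):
    """Iterative Cooley-Tukey NTT.

    len(coeffs) must be a power of two and omega must be a primitive
    len(coeffs)-th root of unity modulo p.
    """
    n = len(coeffs)
    a = bit_reverse_permute(coeffs)
    length = 2
    while length <= n:
        w_len = pow(omega, n // length, p)  # root of unity for this stage
        half = length // 2
        # Each (start, j) pair touches a distinct pair of slots, so a GPU
        # could assign one thread per butterfly within a stage.
        for start in range(0, n, length):
            w = 1
            for j in range(half):
                u = a[start + j]
                v = a[start + j + half] * w % p
                a[start + j] = (u + v) % p
                a[start + j + half] = (u - v) % p
                w = w * w_len % p
        length *= 2
    return a


def naive_dft(coeffs, omega=OMEGA, p=P):
    """O(n^2) reference: evaluate the polynomial at every power of omega."""
    n = len(coeffs)
    return [
        sum(c * pow(omega, i * k, p) for k, c in enumerate(coeffs)) % p
        for i in range(n)
    ]
```

Comparing `ntt(x)` against `naive_dft(x)` on a length-8 input is a quick way to check the butterfly indexing before porting anything to a GPU kernel.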

hugo-blue commented 1 year ago

> Others:
>
> See also my quick analysis at: mratsim/constantine#92
>
> There are 2 additional backends that might be interesting:
>
>   • AMD GPUs, in particular because AMD offers significantly more memory than Nvidia (see AMD teasing: https://community.amd.com/t5/gaming/building-an-enthusiast-pc/ba-p/599407), but they aren't available in cloud machines.
>   • Apple Metal: thanks to unified memory, Mac Studios and Mac Pros can access up to 192 GB of memory, enough to fit the super-circuit. However, Metal assembly is closed source. I tried to look into reverse-engineering efforts, either from Apple LLVM or Asahi Linux, to at least find add-with-carry, but I'm not hopeful.
>
> Intel integrated GPUs also have unified memory, but they are not powerful enough. If we want to use those, we need to wait for an LLVM version whose SPIR-V support is no longer experimental; otherwise LLVM needs to be built from source along with a couple of other LLVM+SPIR-V translators.
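To make the add-with-carry concern concrete: field elements in a prover are wider than a GPU register, so every addition is a chain of word-sized additions that pass a carry bit along. The sketch below models that chain in plain Python. The 8 x 32-bit limb layout is an assumption for illustration (a common way to hold a 256-bit value on a 32-bit GPU ALU), and `add_with_carry` is a hypothetical stand-in for the hardware instruction, not an API from any library.

```python
LIMB_BITS = 32                     # assumed native word size of the GPU ALU
LIMB_MASK = (1 << LIMB_BITS) - 1
NUM_LIMBS = 8                      # 8 x 32 bits = 256 bits


def to_limbs(x):
    """Split a non-negative integer below 2**256 into little-endian limbs."""
    return [(x >> (LIMB_BITS * i)) & LIMB_MASK for i in range(NUM_LIMBS)]


def from_limbs(limbs):
    """Inverse of to_limbs."""
    return sum(limb << (LIMB_BITS * i) for i, limb in enumerate(limbs))


def add_with_carry(a, b, carry_in):
    """Model of one add-with-carry step: returns (32-bit sum, carry out)."""
    total = a + b + carry_in
    return total & LIMB_MASK, total >> LIMB_BITS


def bigint_add(a_limbs, b_limbs):
    """Add two 256-bit numbers limb by limb, threading the carry through."""
    out = []
    carry = 0
    for a, b in zip(a_limbs, b_limbs):
        s, carry = add_with_carry(a, b, carry)
        out.append(s)
    return out, carry
```

Without a hardware carry flag, each step has to recover the carry with extra comparisons, which is why the availability of this one instruction matters so much for a backend.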

Since many Nvidia GPUs are available on the crypto mining market, focusing on Nvidia GPUs should be enough.

For each ZKP project, there should also be a common memory management module for MSM, FFT, and so on, to reduce data-copy time and save memory.

mratsim commented 1 year ago

> Since many Nvidia GPUs are available on the crypto mining market, focusing on Nvidia GPUs should be enough.

The miners focused on megahash per watt first, which was dominated by AMD GPUs, and then they moved to Nvidia GPUs. However, GPUs with a large amount of VRAM consume more power (and cost more) without the extra memory being useful for parallel SHA256 computation.

Concretely, they bought a lot of AMD RX 480 and Nvidia GTX 1080 Ti cards, but those had only 8 GB and 11 GB of VRAM respectively.

And Nvidia is still gimping the VRAM of its GPUs (there are AMD consumer GPUs with 24 GB).

> For each ZKP project, there should also be a common memory management module for MSM, FFT, and so on, to reduce data-copy time and save memory.

Do you have an example of this? Even on CPUs.

hugo-blue commented 1 year ago

> > Since many Nvidia GPUs are available on the crypto mining market, focusing on Nvidia GPUs should be enough.
>
> The miners focused on megahash per watt first, which was dominated by AMD GPUs, and then they moved to Nvidia GPUs. However, GPUs with a large amount of VRAM consume more power (and cost more) without the extra memory being useful for parallel SHA256 computation.
>
> Concretely, they bought a lot of AMD RX 480 and Nvidia GTX 1080 Ti cards, but those had only 8 GB and 11 GB of VRAM respectively.
>
> And Nvidia is still gimping the VRAM of its GPUs (there are AMD consumer GPUs with 24 GB).

I see. So it is a challenge to let low-end machines with GPUs like the 1080 do ZKP proving.
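A rough back-of-the-envelope sketch of that challenge, under stated assumptions: 32-byte scalars and 64-byte uncompressed affine points (the sizes for a 254-bit curve such as BN254), MSM sizes of 2^26 and 2^28 picked purely for illustration (not measured values for the super-circuit), card capacities treated as GiB, and a hypothetical 80% of VRAM usable for inputs. It only counts input data, so real requirements would be higher.

```python
SCALAR_BYTES = 32        # assumed: one 254-bit scalar stored in 32 bytes
AFFINE_POINT_BYTES = 64  # assumed: uncompressed affine point (x, y)
GIB = 1 << 30


def msm_input_bytes(k):
    """Bytes needed to hold the scalars and bases of one MSM of size 2**k."""
    n = 1 << k
    return n * (SCALAR_BYTES + AFFINE_POINT_BYTES)


def chunks_needed(k, vram_gib, usable_fraction=0.8):
    """Number of equal pieces an MSM must be split into to fit in VRAM.

    usable_fraction is a hypothetical allowance for bucket storage and
    other scratch buffers; a real prover would have to measure it.
    """
    budget = int(vram_gib * GIB * usable_fraction)
    total = msm_input_bytes(k)
    return -(-total // budget)  # ceiling division


for vram in (8, 11, 24):
    print(f"{vram} GiB card: 2^26 -> {chunks_needed(26, vram)} chunk(s), "
          f"2^28 -> {chunks_needed(28, vram)} chunk(s)")
```

Under these assumptions a single 2^26 MSM (6 GiB of inputs) just fits on an 8 GiB card, while a 2^28 MSM has to be streamed in several pieces even on a 24 GiB card.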

> > For each ZKP project, there should also be a common memory management module for MSM, FFT, and so on, to reduce data-copy time and save memory.
>
> Do you have an example of this? Even on CPUs.

On CPUs, the system DDR memory is shared by all computation, so there is no need to care about this. GPUs have limited memory, smaller than the system DDR, so memory management is essential.
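One possible reading of such a module, sketched as a toy: a fixed-budget pool that MSM and FFT stages both draw buffers from, so a buffer released by one stage is reused by the next instead of being reallocated and copied again. Everything here is hypothetical (the `DevicePool` name, its methods, and the use of `bytearray` as a stand-in for a device allocation); it illustrates the idea rather than any existing library.

```python
class DevicePool:
    """Toy model of a fixed-budget device memory pool shared by several stages.

    Buffers are plain bytearrays standing in for GPU allocations; a real
    implementation would wrap the device allocator instead.
    """

    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.in_use = 0
        self.free = {}         # size -> list of idle buffers of that size
        self.fresh_allocs = 0  # how often the (slow) allocator was hit

    def _idle_bytes(self):
        return sum(size * len(bufs) for size, bufs in self.free.items())

    def acquire(self, size):
        """Return a buffer of exactly `size` bytes, reusing an idle one if possible."""
        idle = self.free.get(size)
        if idle:
            buf = idle.pop()
        else:
            # Drop idle buffers if keeping them would blow the budget.
            if self.in_use + self._idle_bytes() + size > self.capacity:
                self.free.clear()
            if self.in_use + size > self.capacity:
                raise MemoryError("device budget exceeded")
            buf = bytearray(size)
            self.fresh_allocs += 1
        self.in_use += size
        return buf

    def release(self, buf):
        """Hand a buffer back so a later stage can reuse it."""
        self.in_use -= len(buf)
        self.free.setdefault(len(buf), []).append(buf)


pool = DevicePool(capacity_bytes=1024)
fft_buf = pool.acquire(512)   # e.g. scratch space for an FFT stage
pool.release(fft_buf)
msm_buf = pool.acquire(512)   # a later MSM stage gets the same buffer back
print(msm_buf is fft_buf, pool.fresh_allocs)
```

The budget check is what a CPU prover never needs: with a hard VRAM ceiling, the pool has to decide what to evict or refuse, rather than letting the operating system page memory.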