cybercongress / go-cyber


Optimization of CUDA kernel for multiple GPUs #342

Open cyborgshead opened 5 years ago

cyborgshead commented 5 years ago

We have network limits that are tied to the speed of processing (the rank calculation window) and to the size of the graph (the onboard memory of the GPU or GPUs).

Mainnet will launch with a fairly large rank calculation window (>=100 blocks) and a small amount of network bandwidth, which will give the community time to upgrade the kernel and give validators time to upgrade their hardware.
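Since the whole graph must fit in onboard GPU memory, a back-of-envelope estimate is useful when thinking about these limits. A Go sketch, where the CSR layout, field sizes, and graph dimensions are all illustrative assumptions rather than cyberd's actual data structures:

```go
package main

import "fmt"

// csrBytes estimates whether a graph in CSR form fits in a single GPU's
// onboard memory. Field sizes here are assumptions, not cyberd's layout.
func csrBytes(nodes, links uint64) uint64 {
	const idxSize = 8  // assumed: 64-bit column index per link
	const ptrSize = 8  // assumed: 64-bit row pointer per node
	const rankSize = 8 // assumed: float64 rank per node (two buffers)
	return links*idxSize + (nodes+1)*ptrSize + 2*nodes*rankSize
}

func main() {
	nodes, links := uint64(1_000_000), uint64(100_000_000)
	gb := float64(csrBytes(nodes, links)) / (1 << 30)
	fmt.Printf("~%.2f GiB for %d nodes, %d links\n", gb, nodes, links)
}
```

Under these assumptions, link storage dominates, so the number of links in the graph is the binding constraint on GPU memory.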

Right now, much of the time is spent preparing data before sending it to the GPU, and we only have a single-GPU CUDA implementation of the PageRank algorithm.
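For reference, what the single-GPU kernel parallelizes is the standard power-iteration PageRank. A minimal CPU sketch in Go over an in-link CSR graph (the struct names, layout, and damping factor are illustrative assumptions, not cyberd's actual structures):

```go
package main

import "fmt"

// CSR holds a graph by incoming links per node. Illustrative layout only.
type CSR struct {
	Ptr []int // Ptr[i]..Ptr[i+1] index the in-links of node i
	Src []int // source node of each in-link
	Out []int // out-degree of each node
}

// pageRank runs power iteration; the inner loop over nodes is the part a
// CUDA kernel would parallelize (one thread per destination node).
func pageRank(g CSR, damping float64, iters int) []float64 {
	n := len(g.Ptr) - 1
	rank := make([]float64, n)
	next := make([]float64, n)
	for i := range rank {
		rank[i] = 1.0 / float64(n)
	}
	for it := 0; it < iters; it++ {
		for i := 0; i < n; i++ {
			sum := 0.0
			for e := g.Ptr[i]; e < g.Ptr[i+1]; e++ {
				s := g.Src[e]
				sum += rank[s] / float64(g.Out[s])
			}
			next[i] = (1-damping)/float64(n) + damping*sum
		}
		rank, next = next, rank
	}
	return rank
}

func main() {
	// Tiny example graph: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0
	g := CSR{
		Ptr: []int{0, 1, 2, 4},
		Src: []int{2, 0, 0, 1},
		Out: []int{2, 1, 1},
	}
	r := pageRank(g, 0.85, 50)
	fmt.Printf("%.3f %.3f %.3f\n", r[0], r[1], r[2])
}
```

The data-preparation cost mentioned above is exactly the work of building `Ptr`/`Src`/`Out` (or whatever the real structures are) from the stored graph before each rank window.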

My proposal is to start with a stand-alone optimized kernel and redefine the data structures during performance research and implementation of the multi-GPU kernel. Then refactor the structures in cyberd and migrate to the new kernel.

References: https://github.com/cybercongress/cyberd/issues/229

Note:

  1. Single host, single GPU <-----We are here----->
  2. Single host, multiple GPUs (x16 PCI Express)
  3. Multiple hosts, multiple GPUs
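One common way to sketch step 2 (single host, multiple GPUs) is to partition the destination-node range across devices, replicate the rank vector on each, and have every device update only its own slice per iteration. A hedged Go simulation using goroutines in place of GPUs; the partitioning scheme and all names are assumptions for illustration, not the proposed kernel design:

```go
package main

import (
	"fmt"
	"sync"
)

// iterate runs one PageRank iteration with the destination nodes split
// into contiguous chunks, one goroutine per chunk standing in for one GPU.
// Each "device" reads the full (replicated) rank vector but writes only
// its own slice of next, so no write conflicts occur.
func iterate(rank, next []float64, ptr, src, out []int, damping float64, devices int) {
	n := len(rank)
	chunk := (n + devices - 1) / devices
	var wg sync.WaitGroup
	for d := 0; d < devices; d++ {
		lo, hi := d*chunk, min((d+1)*chunk, n)
		wg.Add(1)
		go func(lo, hi int) {
			defer wg.Done()
			for i := lo; i < hi; i++ {
				sum := 0.0
				for e := ptr[i]; e < ptr[i+1]; e++ {
					sum += rank[src[e]] / float64(out[src[e]])
				}
				next[i] = (1-damping)/float64(n) + damping*sum
			}
		}(lo, hi)
	}
	wg.Wait() // barrier: on real GPUs, a device sync plus rank-vector exchange
}

func main() {
	// Same tiny graph as before: 0 -> 1, 0 -> 2, 1 -> 2, 2 -> 0 (CSR by in-links)
	ptr := []int{0, 1, 2, 4}
	src := []int{2, 0, 0, 1}
	out := []int{2, 1, 1}
	rank := []float64{1.0 / 3, 1.0 / 3, 1.0 / 3}
	next := make([]float64, 3)
	for it := 0; it < 50; it++ {
		iterate(rank, next, ptr, src, out, 0.85, 2)
		rank, next = next, rank
	}
	fmt.Printf("%.3f %.3f %.3f\n", rank[0], rank[1], rank[2])
}
```

The cost this hides is the per-iteration exchange of the updated rank slices between devices, which is where the x16 PCI Express bandwidth in step 2 becomes relevant.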
serejandmyself commented 5 years ago

I re-wrote your task slightly, as I understand it. It would be good if you could correct me if I got it wrong.

In any case, I also added some (maybe) useful links to some research and articles:

Current situation:

Problem:

Task:

Desired outcome:

Steps to solve:

What might be needed (?):

Some articles and research that might be useful (not all of them will be):