Layr-Labs / eigenda

Secure, high-throughput, and decentralized Data Availability

[1/N][GPU encoder] Add benchmarking code and refactor encoding module #715

Closed: dmanc closed this 1 month ago

dmanc commented 1 month ago

Why are these changes needed?

This PR adds benchmarking code under encoding/bench. Run make benchmark_cpu to run a benchmark with the default settings. The command writes a file called benchmark_results.json that lists the encode time for each run.


    flag.StringVar(&config.OutputFile, "output", "benchmark_results.json", "Output file for results")
    flag.Uint64Var(&config.BlobLength, "blob-length", 1048576, "Blob length (power of 2)")
    flag.Uint64Var(&config.NumChunks, "num-chunks", 8192, "Minimum number of chunks (power of 2)")
    flag.Uint64Var(&config.NumRuns, "num-runs", 10, "Number of times to run the benchmark")
    flag.StringVar(&config.CPUProfile, "cpuprofile", "", "Write CPU profile to file")
    flag.StringVar(&config.MemProfile, "memprofile", "", "Write memory profile to file")
    flag.BoolVar(&config.EnableVerify, "enable-verify", false, "Verify blobs after encoding")
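The flag definitions above reference fields on a config value. As a rough sketch of how they might fit together, the following program declares a Config struct with those field names, registers the same flags with the same defaults, and rejects values that are not powers of two. The struct layout, the parseConfig helper, and the power-of-two validation are assumptions made for this example; the PR excerpt only shows the flag registrations.

```go
package main

import (
	"flag"
	"fmt"
	"math/bits"
	"os"
)

// Config mirrors the fields referenced by the flag definitions.
// Only the field names come from the PR; the types are inferred from
// the flag functions used, and the struct itself is an assumption.
type Config struct {
	OutputFile   string
	BlobLength   uint64
	NumChunks    uint64
	NumRuns      uint64
	CPUProfile   string
	MemProfile   string
	EnableVerify bool
}

// isPowerOfTwo reports whether n has exactly one bit set.
func isPowerOfTwo(n uint64) bool {
	return bits.OnesCount64(n) == 1
}

// parseConfig is a hypothetical helper that parses args into a Config
// and checks the "power of 2" constraints stated in the flag help text.
func parseConfig(args []string) (*Config, error) {
	var config Config
	fs := flag.NewFlagSet("bench", flag.ContinueOnError)
	fs.StringVar(&config.OutputFile, "output", "benchmark_results.json", "Output file for results")
	fs.Uint64Var(&config.BlobLength, "blob-length", 1048576, "Blob length (power of 2)")
	fs.Uint64Var(&config.NumChunks, "num-chunks", 8192, "Minimum number of chunks (power of 2)")
	fs.Uint64Var(&config.NumRuns, "num-runs", 10, "Number of times to run the benchmark")
	fs.StringVar(&config.CPUProfile, "cpuprofile", "", "Write CPU profile to file")
	fs.StringVar(&config.MemProfile, "memprofile", "", "Write memory profile to file")
	fs.BoolVar(&config.EnableVerify, "enable-verify", false, "Verify blobs after encoding")

	if err := fs.Parse(args); err != nil {
		return nil, err
	}
	if !isPowerOfTwo(config.BlobLength) {
		return nil, fmt.Errorf("blob-length %d is not a power of 2", config.BlobLength)
	}
	if !isPowerOfTwo(config.NumChunks) {
		return nil, fmt.Errorf("num-chunks %d is not a power of 2", config.NumChunks)
	}
	return &config, nil
}

func main() {
	config, err := parseConfig(os.Args[1:])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%+v\n", *config)
}
```

Using a dedicated flag.FlagSet instead of the package-level flag functions keeps parsing testable, since the helper can be called with any argument slice.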

The PR also refactors the encoding module so that it can support GPU-based components in the future. Separating out the ideas present in into multiple PRs.

Got the following results on a g6.4xlarge (using only the CPU code):

| Encoded Blob Size | Num Chunks | Chunk Len | Encoding time (avg 10 runs) | Dominant factor |
| ----------------- | ---------- | --------- | --------------------------- | --------------- |
| 32768             | 8192       | 1         | 12.774s                     | Multiproof fft1 |
| 65536             | 8192       | 2         | 12.853s                     | Multiproof fft1 |
| 131072            | 8192       | 4         | 12.969s                     | Multiproof fft1 |
| 262144            | 8192       | 8         | 13.099s                     | Multiproof fft1 |
| 524288            | 8192       | 16        | 13.360s                     | Multiproof fft1 |
| 1048576           | 8192       | 32        | 13.765s                     | Multiproof fft1 |
| 2097152           | 8192       | 64        | 14.496s                     | Multiproof fft1 |
| 4194304           | 8192       | 128       | 15.803s                     | Multiproof fft1 |
| 8388608           | 8192       | 256       | 18.043s                     | Multiproof fft1 |
| 16777216          | 8192       | 512       | 24.041s                     | Multiproof msm  |
| 33554432          | 8192       | 1024      | 29.168s                     | Multiproof msm  |

In addition, at the larger blob sizes the Reed-Solomon encoding also becomes a dominant factor, which suggests we should focus on accelerating ComputeMultiFrameProof and ExtendPolyEval.
