UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

Bandwidth reporting is broken on Julia #194

Open tom91136 opened 5 months ago

Got this on a GH200; the actual bandwidth should be ~3 TB/s:

BabelStream
Version: 5.0
Implementation: Julia; src/CUDAStream.jl
Running kernels 100 times
Precision: double
Array size: 268.4 MB(=0.3 GB)
Total size: 805.3 MB(=0.8 GB)
Using CUDA device: GH200 120GB (CuDevice(0))
Kernel parameters: <<<32768,1024>>>
Init: 4.85317 s (=165.93425 MBytes/sec)
Read: 1.90727 s (=422.2296 MBytes/sec)
Function    MBytes/sec  Min (sec)   Max         Average     
Copy        2.979008268e60.00018     0.34157     0.00513     
Mul         2.890738861e60.00019     0.16855     0.00352     
Add         3.301423655e60.00024     0.15224     0.00315     
Triad       3.314891033e60.00024     0.18389     0.00335     
Dot         1.837655013e60.00029     1.26299     0.01428 

The MBytes/sec and Min columns run together with no space between them, so 2.979008268e6 followed by 0.00018 reads as 2.979008268e60. The bandwidth also shouldn't be printed in scientific notation.
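One possible fix, sketched with Julia's standard `Printf` library: give every column a fixed width and print the bandwidth with `%f`, so it never falls back to scientific notation and always leaves a gap before the Min column. This is not the code currently in the Julia implementation. The helper names `format_header` and `format_row`, the column widths, and the three decimal places are all assumptions chosen for illustration.

```julia
using Printf

# Hypothetical helpers, not taken from the BabelStream sources.
# Widths are illustrative: 12 characters per column, 14 for the bandwidth,
# so an 11-character value like "2979008.268" still leaves a gap.
function format_header()
    return @sprintf("%-12s%-14s%-12s%-12s%-12s",
                    "Function", "MBytes/sec", "Min (sec)", "Max", "Average")
end

function format_row(name, mbytes_per_sec, tmin, tmax, tavg)
    # %-14.3f: left-aligned, fixed-point, 3 decimals, never scientific notation
    return @sprintf("%-12s%-14.3f%-12.5f%-12.5f%-12.5f",
                    name, mbytes_per_sec, tmin, tmax, tavg)
end

println(format_header())
println(format_row("Copy", 2.979008268e6, 0.00018, 0.34157, 0.00513))
```

With the Copy values from the log above, the row comes out as `Copy        2979008.268   0.00018     0.34157     0.00513`, so the bandwidth and Min fields can no longer be read as one number.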