UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

Problem for test result #125

Closed didiHe closed 2 years ago

didiHe commented 2 years ago

I'm confused by the parameters below:

BabelStream
Version: 4.0
Implementation: OpenCL
Running kernels 100 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3 MB (=0.8 GB)
Using OpenCL device 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
Driver: 1.2.0.37
Reduction kernel config: 4 groups of size 8
Function MBytes/sec Min (sec) Max Average
Copy 11408.651 0.04706 0.07711 0.05327
Mul 11465.756 0.04682 0.07867 0.05287
Add 12715.411 0.06333 0.12243 0.07119
Triad 12743.574 0.06319 0.10276 0.07145
Dot 14336.124 0.03745 0.13272 0.07375

  1. What do array size and total size mean?
  2. What is the reduction kernel config?
  3. Min (sec): what is the time used for?

tomdeakin commented 2 years ago

BabelStream allocates 3 arrays: A, B and C. The Array size printed is the size of each array, in MB (base 10; there is a command line option to switch to base 2). The Total size is the total memory footprint, i.e. 3 times the array size. You can set the array size by setting the number of elements in the array on the command line with the --arraysize option.
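As a rough check of the sizes above, the arithmetic can be sketched in a few lines of Python. The element count of 2**25 is an assumption, chosen because it reproduces the 268.4 MB shown in your output for double precision; the `footprint` helper is purely illustrative and is not part of BabelStream.

```python
# Assumption: 2**25 elements per array, which is consistent with the
# 268.4 MB array size in the output above for 8-byte doubles.
ARRAY_ELEMENTS = 2**25
BYTES_PER_DOUBLE = 8
NUM_ARRAYS = 3  # A, B and C


def footprint(num_elements, bytes_per_element=BYTES_PER_DOUBLE, base2=False):
    """Return (array_size, total_size) in MB (base 10) or MiB (base 2)."""
    unit = 2**20 if base2 else 10**6
    array_bytes = num_elements * bytes_per_element
    return array_bytes / unit, NUM_ARRAYS * array_bytes / unit


array_mb, total_mb = footprint(ARRAY_ELEMENTS)
print(f"Array size: {array_mb:.1f} MB")
print(f"Total size: {total_mb:.1f} MB")
```

Under that assumption the two printed values round to the 268.4 MB and 805.3 MB reported in the output.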

The reduction kernel config refers to how many work-groups the dot kernel uses and how many work-items are in each group; in your output that is 4 work-groups of 8 work-items. The device is queried to choose a configuration that should perform well: https://github.com/UoB-HPC/BabelStream/blob/e21134d53814147595aa2d96fcb94800b77a35dc/src/ocl/OCLStream.cpp#L113-L123 The dot kernel is then launched using these parameters: https://github.com/UoB-HPC/BabelStream/blob/e21134d53814147595aa2d96fcb94800b77a35dc/src/ocl/OCLStream.cpp#L258
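To illustrate what a "groups of size" configuration means for a reduction, here is a minimal host-side simulation in Python. It is a sketch of a generic two-stage dot product, assuming each work-item strides through the arrays and each work-group produces one partial sum; it is not the code from OCLStream.cpp, and `dot_two_stage` is a hypothetical name.

```python
def dot_two_stage(a, b, num_groups=4, group_size=8):
    """Simulate a work-group based dot product on the host.

    Stage 1: each work-group accumulates one partial sum.
    Stage 2: the partial sums are added together.
    """
    total_items = num_groups * group_size
    partial_sums = []
    for group in range(num_groups):
        group_sum = 0.0
        for local_id in range(group_size):
            global_id = group * group_size + local_id
            # Each simulated work-item visits every total_items-th element.
            for i in range(global_id, len(a), total_items):
                group_sum += a[i] * b[i]
        partial_sums.append(group_sum)
    return sum(partial_sums)


a = [float(i) for i in range(100)]
b = [2.0] * 100
print(dot_two_stage(a, b))  # same value as sum(x * y for x, y in zip(a, b))
```

With 4 groups of size 8 there are 32 simulated work-items in total, so the choice of configuration determines how the array is divided up, not the result.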

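The times are used to calculate the sustained memory bandwidth: the MBytes/sec column is derived from the minimum time for each kernel, and the Max and Average times are reported alongside it. The output is designed to mirror that of the original STREAM benchmark, on which BabelStream is based.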
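The relationship between the Min (sec) column and the MBytes/sec column can be checked with a short Python sketch. It assumes 2**25 doubles per array (consistent with the 268.4 MB above) and that Copy, Mul and Dot each touch two arrays while Add and Triad touch three; the names used here are illustrative only.

```python
# Assumption: 2**25 elements of 8 bytes each, matching the 268.4 MB array size.
ARRAY_BYTES = 2**25 * 8

# Assumption: number of arrays read or written by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Mul": 2, "Add": 3, "Triad": 3, "Dot": 2}

# Min (sec) values copied from the output in the question.
MIN_SECONDS = {
    "Copy": 0.04706,
    "Mul": 0.04682,
    "Add": 0.06333,
    "Triad": 0.06319,
    "Dot": 0.03745,
}


def bandwidth_mbytes_per_sec(kernel, min_seconds):
    """Bytes moved by one kernel run divided by its fastest time, in MB/s."""
    return 1e-6 * ARRAYS_TOUCHED[kernel] * ARRAY_BYTES / min_seconds


for kernel, seconds in MIN_SECONDS.items():
    print(f"{kernel:6s} {bandwidth_mbytes_per_sec(kernel, seconds):10.1f}")
```

The values this prints agree with the reported MBytes/sec column to within the rounding of the printed times, which suggests the bandwidth figure is based on the minimum time rather than the average.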