didiHe closed 2 years ago
BabelStream allocates 3 arrays: A, B and C. The Array size printed is the size of each array, in MB (base 10; there is a command line option to switch to base 2). The Total size is the total memory footprint, i.e. 3 times the array size.
You can set the array size by passing the number of elements per array on the command line with the --arraysize option.
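As a rough check of the size arithmetic described above, here is a minimal sketch. It assumes 2^25 elements per array and 8-byte doubles, which is consistent with the 268.4 MB figure in the reported output; the function name and its parameters are hypothetical and not part of BabelStream.

```python
def array_footprint(num_elements, bytes_per_element=8, num_arrays=3, base2=False):
    """Return (per-array size, total size) in MB, or MiB when base2 is True."""
    unit = 2**20 if base2 else 10**6
    per_array = num_elements * bytes_per_element / unit
    return per_array, num_arrays * per_array


# Assumed element count: 2**25 doubles per array.
per_array_mb, total_mb = array_footprint(2**25)
print(f"Array size: {per_array_mb:.1f} MB")
print(f"Total size: {total_mb:.1f} MB")
```

With these assumptions the per-array and total figures round to the 268.4 MB and 805.3 MB shown in the output.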
The reduction kernel config refers to how many work-items and work-groups the dot kernel uses. The device is queried to work out a "good" configuration that performs well: https://github.com/UoB-HPC/BabelStream/blob/e21134d53814147595aa2d96fcb94800b77a35dc/src/ocl/OCLStream.cpp#L113-L123
The dot kernel is then launched using these parameters:
https://github.com/UoB-HPC/BabelStream/blob/e21134d53814147595aa2d96fcb94800b77a35dc/src/ocl/OCLStream.cpp#L258
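To make "4 groups of size 8" concrete, the following is a plain Python simulation of a two-stage dot-product reduction with that shape: 32 work-items in total, each summing a strided slice of the arrays, followed by a tree reduction inside each group and a final sum over the per-group results. It is only an illustrative sketch of the general technique, not the actual OpenCL kernel; the function name is hypothetical and the tree step assumes the group size is a power of two.

```python
def simulated_dot(a, b, num_groups=4, group_size=8):
    """Simulate a work-group style reduction of sum(a[i] * b[i])."""
    global_size = num_groups * group_size  # total number of work-items
    group_sums = []
    for g in range(num_groups):
        # Stage 1: each work-item accumulates elements gid, gid + global_size, ...
        local = []
        for l in range(group_size):
            gid = g * group_size + l
            local.append(sum(a[i] * b[i] for i in range(gid, len(a), global_size)))
        # Stage 2: tree reduction within the group (power-of-two size assumed).
        offset = group_size // 2
        while offset > 0:
            for l in range(offset):
                local[l] += local[l + offset]
            offset //= 2
        group_sums.append(local[0])
    # Stage 3: the per-group partial sums are added up at the end.
    return sum(group_sums)


a = list(range(1000))
b = [2] * 1000
print(simulated_dot(a, b) == sum(x * y for x, y in zip(a, b)))
```

The group count and group size only change how the work is split up; the result is the same as an ordinary dot product (up to floating-point rounding order for non-integer data).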
The timings are used to calculate the sustained memory bandwidth. The output is designed to mirror that of the original STREAM benchmark, on which BabelStream is based.
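The relationship between the timings and the MBytes/sec column can be checked with a short sketch. It assumes the bandwidth is the number of bytes each kernel moves divided by the minimum time, that Copy, Mul and Dot touch two arrays while Add and Triad touch three, and that each array holds 2^25 doubles. These are assumptions chosen because they reproduce the figures in the output from the question; the dictionary and function names are hypothetical.

```python
ARRAY_MB = 2**25 * 8 / 1e6  # assumed: 2**25 doubles per array, base-10 MB

# Assumed number of arrays read or written by each kernel.
ARRAYS_TOUCHED = {"Copy": 2, "Mul": 2, "Add": 3, "Triad": 3, "Dot": 2}

# Minimum times in seconds, copied from the output in the question.
MIN_TIME = {"Copy": 0.04706, "Mul": 0.04682, "Add": 0.06333,
            "Triad": 0.06319, "Dot": 0.03745}

# MBytes/sec values, copied from the same output for comparison.
REPORTED = {"Copy": 11408.651, "Mul": 11465.756, "Add": 12715.411,
            "Triad": 12743.574, "Dot": 14336.124}


def bandwidth_mb_per_s(kernel):
    """Bytes moved by the kernel (in MB) divided by its minimum time."""
    return ARRAYS_TOUCHED[kernel] * ARRAY_MB / MIN_TIME[kernel]


for name in REPORTED:
    print(f"{name:6s} computed {bandwidth_mb_per_s(name):10.1f}  reported {REPORTED[name]:10.3f}")
```

Under these assumptions the computed values agree with the reported ones to within about 1 MB/s; the small differences come from the times being printed with only five decimal places.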
I'm confused by the parameters below:

BabelStream
Version: 4.0
Implementation: OpenCL
Running kernels 100 times
Precision: double
Array size: 268.4 MB (=0.3 GB)
Total size: 805.3 MB (=0.8 GB)
Using OpenCL device 11th Gen Intel(R) Core(TM) i5-1135G7 @ 2.40GHz
Driver: 1.2.0.37
Reduction kernel config: 4 groups of size 8
Function    MBytes/sec  Min (sec)   Max         Average
Copy        11408.651   0.04706     0.07711     0.05327
Mul         11465.756   0.04682     0.07867     0.05287
Add         12715.411   0.06333     0.12243     0.07119
Triad       12743.574   0.06319     0.10276     0.07145
Dot         14336.124   0.03745     0.13272     0.07375