UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

OpenMP target overflow issues? #140

Closed jeffhammond closed 2 years ago

jeffhammond commented 2 years ago

I am using ROCm 5 on an MI100.

Fortran looks great with 64-bit integer indexing:

jehammond@mi100:~/BabelStream/src/fortran$ ./BabelStream.amd.OpenMPTarget -s $((1024*1024*512))
BabelStream Fortran
Version:  4.0
Implementation: OpenMPTarget
Running kernels 100 times
Precision: REAL64
Array size:    4295.0MB
Total size:   12884.9MB
Function    MBytes/sec  Min (sec)   Max         Average
Copy        749819.709  0.01146     0.01165     0.01143
Mul         748323.846  0.01148     0.01168     0.01145
Add         750077.243  0.01718     0.01740     0.01711
Triad       752534.861  0.01712     0.01728     0.01704
Dot         739746.348  0.01161     0.01275     0.01194

The C++ version fails to validate:

jehammond@mi100:~/BabelStream/build-amd$ ./omp-stream -s $((1024*1024*512))
BabelStream
Version: 4.0
Implementation: OpenMP
Running kernels 100 times
Precision: double
Array size: 4295.0 MB (=4.3 GB)
Total size: 12884.9 MB (=12.9 GB)
Validation failed on a[]. Average error 0.00168703
Validation failed on c[]. Average error 0.00246025
Function    MBytes/sec  Min (sec)   Max         Average
Copy        733005.563  0.01172     0.01209     0.01191
Mul         732224.338  0.01173     0.01198     0.01184
Add         809417.552  0.01592     0.01610     0.01602
Triad       827253.401  0.01558     0.01596     0.01576
Dot         519368.015  0.01654     0.01674     0.01662

This is how I compiled the C++ version:

git clean -dfx ; cmake .. -DMODEL=omp -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCXX_EXTRA_FLAGS="-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -DOMP_TARGET_GPU=1 -O3"  -DOFFLOAD=1 && make -j32

These are the same flags that I used for Fortran.

jeffhammond commented 2 years ago

Ah, sorry, this is a known issue. I forgot to check whether the C++ version uses int.
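For anyone who lands here later, this is a rough sketch of why this problem size sits right at a 32-bit boundary. It is only arithmetic on the numbers from the runs above; it does not show where the overflow actually happens in the C++ OpenMP target path, which this thread does not establish. The helper names (`fits_in_int32`, `bytes_for_doubles`, `truncate_to_u32`, `kElements`) are hypothetical and not part of BabelStream.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helpers for illustration only; not BabelStream code.

// True if `value` is representable as a signed 32-bit integer.
constexpr bool fits_in_int32(std::int64_t value) {
  return value >= std::numeric_limits<std::int32_t>::min() &&
         value <= std::numeric_limits<std::int32_t>::max();
}

// Size in bytes of an array of `elements` doubles, computed in 64 bits
// so the multiplication itself cannot overflow at these sizes.
constexpr std::int64_t bytes_for_doubles(std::int64_t elements) {
  return elements * static_cast<std::int64_t>(sizeof(double));
}

// What is left of a 64-bit quantity after truncation to 32 bits
// (unsigned conversion is well defined: the value modulo 2^32).
constexpr std::uint32_t truncate_to_u32(std::int64_t value) {
  return static_cast<std::uint32_t>(value);
}

// Element count passed with -s in the runs above: 1024*1024*512 = 2^29.
constexpr std::int64_t kElements = std::int64_t{1024} * 1024 * 512;

// The element count itself still fits in a 32-bit int...
static_assert(fits_in_int32(kElements), "2^29 elements fit in int32");

// ...but one array of doubles is 2^32 bytes (the 4295.0 MB reported above),
// which does not, and which truncates to 0 in 32 bits.
static_assert(!fits_in_int32(bytes_for_doubles(kElements)),
              "byte size exceeds int32");
static_assert(truncate_to_u32(bytes_for_doubles(kElements)) == 0,
              "2^32 bytes wraps to 0 in 32 bits");
```

So any size or offset derived from the element count in 32-bit arithmetic would be off at this `-s` value, whereas 64-bit indexing, as in the Fortran build, has plenty of headroom.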