Closed: jeffhammond closed this issue 2 years ago
I am using ROCm 5 on an MI100.
Fortran looks great with 64-bit integer indexing:
jehammond@mi100:~/BabelStream/src/fortran$ ./BabelStream.amd.OpenMPTarget -s $((1024*1024*512))
BabelStream Fortran
Version: 4.0
Implementation: OpenMPTarget
Running kernels 100 times
Precision: REAL64
Array size: 4295.0MB
Total size: 12884.9MB
Function    MBytes/sec  Min (sec)   Max         Average
Copy        749819.709  0.01146     0.01165     0.01143
Mul         748323.846  0.01148     0.01168     0.01145
Add         750077.243  0.01718     0.01740     0.01711
Triad       752534.861  0.01712     0.01728     0.01704
Dot         739746.348  0.01161     0.01275     0.01194
The C++ version fails to validate:
jehammond@mi100:~/BabelStream/build-amd$ ./omp-stream -s $((1024*1024*512))
BabelStream
Version: 4.0
Implementation: OpenMP
Running kernels 100 times
Precision: double
Array size: 4295.0 MB (=4.3 GB)
Total size: 12884.9 MB (=12.9 GB)
Validation failed on a[]. Average error 0.00168703
Validation failed on c[]. Average error 0.00246025
Function    MBytes/sec  Min (sec)   Max         Average
Copy        733005.563  0.01172     0.01209     0.01191
Mul         732224.338  0.01173     0.01198     0.01184
Add         809417.552  0.01592     0.01610     0.01602
Triad       827253.401  0.01558     0.01596     0.01576
Dot         519368.015  0.01654     0.01674     0.01662
This is how I compiled the C++ version:
git clean -dfx ; cmake .. -DMODEL=omp -DCMAKE_CXX_COMPILER=/opt/rocm/llvm/bin/clang++ -DCXX_EXTRA_FLAGS="-fopenmp -fopenmp-targets=amdgcn-amd-amdhsa -Xopenmp-target=amdgcn-amd-amdhsa -march=gfx908 -DOMP_TARGET_GPU=1 -O3" -DOFFLOAD=1 && make -j32
These are the same flags that I used for Fortran.
Ah, sorry, this is a known issue. I forgot to check whether the C++ version uses int for indexing.
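For readers wondering why int matters at this problem size, here is a minimal sketch of the arithmetic only. It does not reproduce BabelStream's code, and it does not claim to identify exactly where the C++ version goes wrong. The names kElements, kBytesPerElement, array_bytes and fits_in_int32 are hypothetical, and 8 bytes per element is an assumption matching the "double" / REAL64 precision reported above. The point is that the element count from -s $((1024*1024*512)) still fits in a signed 32-bit int, while the size of one array in bytes does not.

```cpp
#include <cstdint>
#include <limits>

// Element count passed via -s $((1024*1024*512)), i.e. 2^29.
constexpr std::int64_t kElements = 1024LL * 1024LL * 512LL;

// Assumption: 8 bytes per element, matching "Precision: double" / REAL64.
constexpr std::int64_t kBytesPerElement = 8;

// Size of one array in bytes, computed in 64-bit arithmetic.
constexpr std::int64_t array_bytes(std::int64_t n) {
    return n * kBytesPerElement;
}

// True if the value is representable as a signed 32-bit integer.
constexpr bool fits_in_int32(std::int64_t v) {
    return v >= std::numeric_limits<std::int32_t>::min() &&
           v <= std::numeric_limits<std::int32_t>::max();
}

// 2^29 elements: still a valid 32-bit index range.
static_assert(fits_in_int32(kElements), "element count fits in int32");

// 2^32 bytes per array (the 4295.0 MB reported above): too large for int32.
static_assert(!fits_in_int32(array_bytes(kElements)),
              "byte size of one array does not fit in int32");

// Three arrays give the 12884.9 MB total reported above.
static_assert(3 * array_bytes(kElements) == 12884901888LL,
              "total size of a, b and c");
```

The static_assert lines are checked at compile time, so compiling this file is enough to confirm the numbers. Any quantity derived from the element count, such as a byte offset or a product of two indices, can exceed the 32-bit range even though the count itself does not, which is why a 64-bit index type is the usual choice for arrays this large.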