UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

Add hipstdpar support to BabelStream #195

Open gsitaram opened 2 months ago

gsitaram commented 2 months ago

This PR adds support for offload to AMD GPUs using the par_unseq execution policy in C++ standard parallelism algorithms. To trigger the GPU offload of all parallel algorithms, the --hipstdpar compilation flag must be provided. For GPU targets other than the current default of gfx906, the --offload-arch=<arch_string> option must also be provided at compile time.
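For readers unfamiliar with the approach, the following is a minimal sketch of what a kernel written against the par_unseq policy can look like. It is not the code in this PR: the function names triad and dot and the use of std::vector are assumptions made for illustration, and BabelStream's own std-data and std-indices implementations may be structured differently. Built with an ordinary host compiler, the same source runs on the CPU. Offload is only expected when the source is compiled with the --hipstdpar flag described above.

```cpp
#include <algorithm>
#include <execution>
#include <numeric>
#include <vector>

// Illustrative triad kernel: a[i] = b[i] + scalar * c[i].
// The par_unseq policy marks the algorithm as safe to run in parallel
// and vectorised, which is what makes it a candidate for offload.
// (Hypothetical helper, not taken from the PR.)
void triad(std::vector<double>& a, const std::vector<double>& b,
           const std::vector<double>& c, double scalar)
{
  std::transform(std::execution::par_unseq,
                 b.begin(), b.end(), c.begin(), a.begin(),
                 [scalar](double bi, double ci) { return bi + scalar * ci; });
}

// Illustrative dot kernel: sum over i of a[i] * b[i].
// This overload of transform_reduce defaults to plus and multiplies.
// (Hypothetical helper, not taken from the PR.)
double dot(const std::vector<double>& a, const std::vector<double>& b)
{
  return std::transform_reduce(std::execution::par_unseq,
                               a.begin(), a.end(), b.begin(), 0.0);
}
```

Calling triad(a, b, c, 0.4) on equally sized vectors overwrites a in place. Nothing in the source is HIP-specific, which is the point of the standard-parallelism models.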

When using ROCm 6.1.0, the compilation commands may look like the following if compiling for an AMD Instinct MI200 series GPU:

cmake -Bbuild -H. -DMODEL=std-data -DCMAKE_CXX_COMPILER=hipcc -DCLANG_OFFLOAD=gfx90a
cmake --build build

Remember to set the HSA_XNACK environment variable to enable address translation and page migration (where applicable) when running std-data-stream or std-indices-stream:

export HSA_XNACK=1

tomdeakin commented 1 month ago

It's great to see hipstdpar working, so let's work to get this merged in. Thanks for the contributions.

gsitaram commented 1 week ago

Hi @tomdeakin, @afanfa and I have made the changes requested. Please check and approve if everything looks okay. Thanks!