UoB-HPC / BabelStream

STREAM, for lots of devices written in many programming models

Thrust Implementation #111

Closed tom91136 closed 2 years ago

tom91136 commented 2 years ago

This PR adds the Thrust implementation. The build script handles both Thrust proper (Nvidia's implementation) and rocThrust (AMD's implementation).

As Thrust doesn't cover device selection or synchronisation APIs, this PR uses a small number of macros to map those calls onto their HIP or CUDA equivalents.
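For illustration, here is a minimal sketch of how such a macro layer could look. It is not the code from this PR: `GPU_FN`, `GPU_SUCCESS`, `device_count`, `select_device` and `synchronise` are hypothetical names chosen for the example, and the host-only fallback branch is an assumption added so the snippet also compiles without a GPU toolchain.

```cpp
// Hypothetical sketch, not the PR's actual code: all names below are illustrative.
#include <stdexcept>
#include <string>

#if defined(__HIPCC__)
  #include <hip/hip_runtime.h>
  #define GPU_FN(fn) hip##fn        // GPU_FN(SetDevice) expands to hipSetDevice
  #define GPU_SUCCESS hipSuccess
#elif defined(__CUDACC__)
  #include <cuda_runtime.h>
  #define GPU_FN(fn) cuda##fn       // GPU_FN(SetDevice) expands to cudaSetDevice
  #define GPU_SUCCESS cudaSuccess
#endif

// Number of devices visible to the runtime.
// Assumption: a host-only build is treated as having one implicit device.
inline int device_count() {
#if defined(GPU_FN)
  int count = 0;
  if (GPU_FN(GetDeviceCount)(&count) != GPU_SUCCESS)
    throw std::runtime_error("could not query the device count");
  return count;
#else
  return 1;
#endif
}

// Make `index` the active device for subsequent Thrust calls.
inline void select_device(int index) {
  if (index < 0 || index >= device_count())
    throw std::out_of_range("no such device: " + std::to_string(index));
#if defined(GPU_FN)
  if (GPU_FN(SetDevice)(index) != GPU_SUCCESS)
    throw std::runtime_error("could not select device " + std::to_string(index));
#endif
}

// Block until all outstanding device work has finished (no-op on the host).
inline void synchronise() {
#if defined(GPU_FN)
  if (GPU_FN(DeviceSynchronize)() != GPU_SUCCESS)
    throw std::runtime_error("device synchronisation failed");
#endif
}
```

Token pasting keeps the call sites identical for both backends, so the benchmark code only ever calls `select_device(i)` and `synchronise()` and never names a HIP or CUDA symbol directly.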

The implementation has been tested with ROCm 4.5.0 on a Radeon VII and with CUDA 11.3 on a Titan X (Pascal), with performance comparable to native HIP and CUDA, respectively.

Finally, this PR also includes updates to the CI for the latest versions of ROCm and CUDA.