StreamHPC / gromacs

OpenCL porting of the GROMACS molecular simulation toolkit
http://www.gromacs.org

128/256 threads/block #92

Open ancahamuraru opened 9 years ago

ancahamuraru commented 9 years ago

Update the OpenCL kernel to support 128 or 256 threads/block, based on the equivalent CUDA kernel; see commit f2b9db2 on the master branch of the main GROMACS repository: https://github.com/gromacs/gromacs/commit/f2b9db2

Evaluate the performance of the new kernel on AMD and NVIDIA GPUs, then decide which version or versions of the OpenCL kernel will be used in the end.

pszi1ard commented 9 years ago

As 256 threads/block will increase register pressure, I suggest trying 128 threads/block too. Additionally, the reduction will become tricky without the lane-shuffle ops.
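
To illustrate the reduction concern: without lane-shuffle instructions, partial results are typically staged through local (shared) memory and combined with a tree reduction, with a barrier between steps. The sketch below emulates that pattern sequentially in plain C on the host. It is not the GROMACS kernel code; `tree_reduce` and its parameters are hypothetical names chosen for illustration, and it assumes the group size is a power of two.

```c
#include <assert.h>

/* Emulates a local-memory tree reduction over one work-group.
 * On a GPU, each iteration of the inner loop would be executed by a
 * separate work-item, and a barrier would separate consecutive strides.
 * Assumption: n is a power of two (e.g. 128 or 256). */
static float tree_reduce(float *local_buf, int n)
{
    assert(n > 0 && (n & (n - 1)) == 0);

    for (int stride = n / 2; stride > 0; stride /= 2) {
        for (int lane = 0; lane < stride; lane++) {
            /* lane adds the partial sum held 'stride' slots away */
            local_buf[lane] += local_buf[lane + stride];
        }
        /* in an OpenCL kernel, barrier(CLK_LOCAL_MEM_FENCE) would go here */
    }
    return local_buf[0];
}
```

With this scheme a 128-wide group needs 7 halving steps and a 256-wide group needs 8, each followed by a barrier, and the buffer in local memory grows with the group size.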

ancahamuraru commented 9 years ago

Thanks for the comment. My mistake: I forgot to mention 128 threads/block. I have now updated the issue title and description.