unroll convolution

usnistgov / hiperc

High Performance Computing Strategies for Boundary Value Problems


Open tkphd opened 6 years ago

tkphd commented 6 years ago

The inner loops of the CUDA convolution code should run faster with a #pragma unroll directive.
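A minimal sketch of the idea, not taken from the hiperc source: every name here (`convolve`, `d_mask`, `MASK_W`, the 16x16 test grid) is hypothetical and chosen only for illustration. It assumes the stencil width is a compile-time constant, since `#pragma unroll` with no count can only unroll a loop completely when the compiler knows the trip count. Whether this is actually faster would need to be measured, because nvcc may already unroll such small loops on its own.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical names throughout; this is an illustration, not hiperc code.
#define MASK_W 3  // compile-time stencil width, so the loop trip count is known

__constant__ float d_mask[MASK_W * MASK_W];

__global__ void convolve(const float* in, float* out, int nx, int ny)
{
    const int x = blockIdx.x * blockDim.x + threadIdx.x;
    const int y = blockIdx.y * blockDim.y + threadIdx.y;
    const int h = MASK_W / 2;

    // Skip boundary cells, which lack a full neighborhood
    if (x < h || x >= nx - h || y < h || y >= ny - h) return;

    float sum = 0.0f;
    // Ask the compiler to fully unroll both inner loops
    #pragma unroll
    for (int j = 0; j < MASK_W; j++) {
        #pragma unroll
        for (int i = 0; i < MASK_W; i++) {
            sum += d_mask[j * MASK_W + i] * in[(y + j - h) * nx + (x + i - h)];
        }
    }
    out[y * nx + x] = sum;
}

int main()
{
    const int nx = 16, ny = 16;
    const size_t bytes = nx * ny * sizeof(float);

    // 5-point Laplacian stencil (assumed for the example)
    const float h_mask[MASK_W * MASK_W] = {0, 1, 0, 1, -4, 1, 0, 1, 0};

    // Input field f(x, y) = x^2, whose discrete Laplacian is 2 in the interior
    float h_in[nx * ny], h_out[nx * ny];
    for (int k = 0; k < nx * ny; k++) {
        const float x = float(k % nx);
        h_in[k] = x * x;
    }

    float *d_in, *d_out;
    cudaMalloc(&d_in, bytes);
    cudaMalloc(&d_out, bytes);
    cudaMemcpy(d_in, h_in, bytes, cudaMemcpyHostToDevice);
    cudaMemset(d_out, 0, bytes);  // boundary cells are never written by the kernel
    cudaMemcpyToSymbol(d_mask, h_mask, sizeof(h_mask));

    const dim3 block(8, 8);
    const dim3 grid((nx + block.x - 1) / block.x, (ny + block.y - 1) / block.y);
    convolve<<<grid, block>>>(d_in, d_out, nx, ny);

    const cudaError_t err = cudaGetLastError();
    if (err != cudaSuccess) {
        fprintf(stderr, "kernel launch failed: %s\n", cudaGetErrorString(err));
        return 1;
    }

    // cudaMemcpy on the default stream waits for the kernel to finish
    cudaMemcpy(h_out, d_out, bytes, cudaMemcpyDeviceToHost);
    printf("interior value at (5, 5): %g\n", h_out[5 * nx + 5]);

    cudaFree(d_in);
    cudaFree(d_out);
    return 0;
}
```

To judge whether the directive helps, one could build the kernel with and without the two `#pragma unroll` lines and compare timings, or inspect the generated code for both variants.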