vetter / shoc

The SHOC Benchmark Suite

CUDA 4.1 FFT 50% Performance Drop #1

Closed kspaff closed 10 years ago

kspaff commented 12 years ago

With the new LLVM-based compiler backend in CUDA 4.1, CUDA FFT performance dropped by 50% on Keeneland. OpenCL performance is unchanged.

I suspect this is due to loops being unrolled differently: the unroll option that used to be passed to the old compiler is now ignored.
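
To illustrate the kind of dependence I mean, here is a minimal sketch (not the SHOC FFT kernel; `sum_points`, `POINTS`, and `run_unroll_check` are hypothetical names made up for this example). It requests unrolling in the source with `#pragma unroll` instead of relying on a compiler command-line option. My assumption is that when a loop over a small per-thread array like this is not unrolled, the array may end up in local memory rather than registers, which could account for a large slowdown.

```cuda
#include <cstdio>
#include <vector>
#include <cuda_runtime.h>

#define POINTS 8  // hypothetical per-thread work size, chosen for illustration

// Hypothetical kernel: each thread loads POINTS values into a small
// per-thread array and sums them. Both loops have a compile-time trip
// count, so "#pragma unroll" asks the compiler to unroll them fully.
__global__ void sum_points(const float *in, float *out, int n)
{
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    if (tid >= n) return;

    float v[POINTS];

    #pragma unroll
    for (int i = 0; i < POINTS; ++i)
        v[i] = in[tid * POINTS + i];

    float acc = 0.0f;
    #pragma unroll
    for (int i = 0; i < POINTS; ++i)
        acc += v[i];

    out[tid] = acc;
}

// Runs the kernel on all-ones input and returns the number of outputs
// that differ from POINTS, or -1 if a CUDA call fails.
int run_unroll_check()
{
    const int n = 1024;
    std::vector<float> h_in(n * POINTS, 1.0f), h_out(n, 0.0f);
    float *d_in = NULL, *d_out = NULL;

    if (cudaMalloc((void **)&d_in, h_in.size() * sizeof(float)) != cudaSuccess)
        return -1;
    if (cudaMalloc((void **)&d_out, h_out.size() * sizeof(float)) != cudaSuccess) {
        cudaFree(d_in);
        return -1;
    }

    cudaError_t err = cudaMemcpy(d_in, &h_in[0], h_in.size() * sizeof(float),
                                 cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        sum_points<<<(n + 255) / 256, 256>>>(d_in, d_out, n);
        // This blocking copy waits for the kernel to finish first.
        err = cudaMemcpy(&h_out[0], d_out, h_out.size() * sizeof(float),
                         cudaMemcpyDeviceToHost);
    }
    cudaFree(d_in);
    cudaFree(d_out);
    if (err != cudaSuccess) return -1;

    int mismatches = 0;
    for (int i = 0; i < n; ++i)
        if (h_out[i] != (float)POINTS) ++mismatches;
    return mismatches;
}

int main()
{
    int bad = run_unroll_check();
    printf("mismatches: %d\n", bad);
    return bad == 0 ? 0 : 1;
}
```

Timing this with and without the pragmas under both compilers would be one way to check whether unrolling is really the cause.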

kspaff commented 10 years ago

Fixed some time ago. FFT can now be built to use cuFFT.