ROCm / rocFFT

Next generation FFT implementation for ROCm
https://rocm.docs.amd.com/projects/rocFFT/en/latest/
Other
175 stars, 84 forks

rtc: use gridDim.x for everything in real-complex kernels #471

Closed: af-ayala closed this 7 months ago

af-ayala commented 7 months ago

Fixes kernel launch errors on very large odd-length real-complex transforms. gridDim.y and gridDim.z are limited to 64k, so the real-complex kernels now use gridDim.x for all grid indexing.
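For illustration only, here is a minimal host-side sketch of the general technique the title describes: folding a logical 3D block grid into a single dimension and recovering the coordinates from the flat index. This is not rocFFT's actual kernel code. The names `BlockCoord`, `flat_grid_size`, `encode_block`, and `decode_block` are hypothetical, and the layout (x fastest, z slowest) is an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical logical block coordinate, standing in for blockIdx.{x,y,z}.
struct BlockCoord {
    std::uint32_t x, y, z;
};

// Number of blocks needed when a logical (nx, ny, nz) grid is folded into
// one dimension. Computed in 64 bits so the product cannot wrap; the caller
// still has to check it against the device's gridDim.x limit.
constexpr std::uint64_t flat_grid_size(std::uint32_t nx, std::uint32_t ny,
                                       std::uint32_t nz) {
    return std::uint64_t(nx) * ny * nz;
}

// Map a logical coordinate to its flat index (x varies fastest).
constexpr std::uint64_t encode_block(BlockCoord c, std::uint32_t nx,
                                     std::uint32_t ny) {
    return c.x + std::uint64_t(nx) * (c.y + std::uint64_t(ny) * c.z);
}

// Recover the logical coordinate from a flat index. In a kernel, `flat`
// would be blockIdx.x, and nx/ny would be passed as kernel arguments.
constexpr BlockCoord decode_block(std::uint64_t flat, std::uint32_t nx,
                                  std::uint32_t ny) {
    return BlockCoord{
        static_cast<std::uint32_t>(flat % nx),
        static_cast<std::uint32_t>((flat / nx) % ny),
        static_cast<std::uint32_t>(flat / (std::uint64_t(nx) * ny))};
}

// Compile-time round-trip check with a y extent above 64k.
static_assert(encode_block(decode_block(1199999, 4, 100000), 4, 100000) == 1199999,
              "decode_block must invert encode_block");
```

With this layout, a logical y or z extent larger than 64k no longer constrains the launch, because only the flattened total has to fit in gridDim.x.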

ROCmMathLibrariesBot commented 7 months ago

Performance reports:
Commit hashes: 313bc20902c0772b5e3417c3f1315e5071044ee8
gfx90a-perf single report
gfx90a-perf double report