spiral-software / spiral-package-fftx

FFTX Package

Develop #52

Closed franzfranchetti closed 2 years ago

franzfranchetti commented 2 years ago

First version of PrunedMDRConv that generates CUDA code, ready for verification. Updated and optimized versions to follow.

broderickpt commented 2 years ago

Rejected changes. The code generated for MDDFT (see examples/library-cuda/mddft-cuda.g) isn't valid for the GPU, because the kernel launch splits the problem with an invalid number of threads per block...

void mddft3d_80x80x80(double Y, double X) {
    dim3 b788(4000, 1, 1), b789(4000, 1, 1), b790(4000, 1, 1),
         g1(16, 1, 1), g2(16, 1, 1), g3(16, 1, 1);
    ker_mddft3d_80x80x800<<<g1, b788>>>(X);

The block size of 4000 threads is invalid: the maximum is 1024 threads per block.
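A minimal host-side sketch of the check that would catch this before launch. It is not part of the generated code or of SPIRAL/FFTX: `Dim3`, `kMaxThreadsPerBlock`, `block_is_valid`, and `blocks_needed` are hypothetical names introduced here, and the limit is hard-coded to the 1024 cited above (on a real device it could be read from `cudaDeviceProp::maxThreadsPerBlock`). The sketch is plain C++ so it compiles without the CUDA toolkit.

```cpp
// Hypothetical stand-in for CUDA's dim3, so this compiles without the CUDA toolkit.
struct Dim3 {
    unsigned x, y, z;
};

// Assumed limit, taken from the review comment (1024 threads per block).
// A real check would query cudaDeviceProp::maxThreadsPerBlock for the target device.
constexpr unsigned kMaxThreadsPerBlock = 1024;

// A block shape is launchable only if the product of its dimensions
// is nonzero and does not exceed the per-block thread limit.
bool block_is_valid(Dim3 b) {
    unsigned long long threads = 1ULL * b.x * b.y * b.z;
    return threads > 0 && threads <= kMaxThreadsPerBlock;
}

// Number of blocks of `block_threads` threads needed to cover `total`
// threads (ceiling division).
unsigned blocks_needed(unsigned long long total, unsigned block_threads) {
    return static_cast<unsigned>((total + block_threads - 1) / block_threads);
}
```

For the launch above, `block_is_valid({4000, 1, 1})` is false. The same 16 x 4000 = 64000 threads could in principle be covered by 64 blocks of 1000 threads, but that only fixes the launch configuration: the generated kernel's index arithmetic on `blockIdx` and `threadIdx` would have to change to match, so the real fix belongs in the code generator.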