Closed pgrete closed 5 years ago
Hi Philipp,
It should work. I can run step1_gpu with the following output (using 8 P100 GPUs):
L1 Error is 1.24047e-09 Relative L1 Error is 3.29369e-15
Results are CORRECT!
L1 Error of iFF(a)-a: 1.34574e-09 Relative L1 Error of iFF(a)-a: 3.5732e-15
Results are CORRECT!
GPU Timing for FFT of size 512*512*512 Setup 1.48673 FFT 0.0720751 IFFT 0.0738944
Have you tried other configurations? e.g.
Pierre.
On Tue, Apr 24, 2018 at 10:07 PM, Philipp Grete notifications@github.com wrote:
I'm currently performing scaling tests of the library and I get incorrect results for the provided test programs on both Pleiades and Titan when using GPUs, e.g. (on Titan)
$ aprun -n 64 ./step1 512 512 512
Input c_dim[0] * c_dims[1] != nprocs. Automatically switching to c_dims[0] = 8 , c_dims_1 = 8
Error is 5.2362e-11 Relative Error is 1.39031e-16
Results are CORRECT!
Timing for FFT of size 512*512*512 Setup 4.36814 FFT 0.252446 IFFT 0.279566
$ aprun -n 4 -N 1 ./step1_gpu 512 512 512
Input c_dim[0] * c_dims[1] != nprocs. Automatically switching to c_dims[0] = 2 , c_dims_1 = 2
L1 Error is 4.57168e+06 Relative L1 Error is 12.1387
L1 Error of iFF(a)-a: 4.57168e+06 Relative L1 Error of iFF(a)-a: 12.1387
GPU Timing for FFT of size 512*512*512 Setup 1.89571 FFT 0.382671 IFFT 1.12519
This is how I installed accfft (on Titan):
module swap PrgEnv-pgi PrgEnv-gnu
module load cudatoolkit
module load cray-fftw
module load cmake
cd src
git clone https://github.com/amirgholami/accfft.git
cd accfft
mkdir build
cd build
cmake -DCMAKE_INSTALL_PREFIX=$HOME/src/accfft/build \
  -DFFTW_ROOT=/opt/cray/fftw/3.3.6.2/interlagos \
  -DFFTW_USE_STATIC_LIBS=true \
  -DBUILD_GPU=true \
  -DBUILD_STEPS=true \
  -DCXX_FLAGS="-O3" \
  -DBUILD_SHARED=false \
  ..
make
Am I doing something wrong or is there a bug/misconfiguration? Please let me know if there's any other information required.
Thanks,
Philipp
I double-checked on another cluster and the results are correct. However, Pierre recently submitted a pull request, which I just merged, and it may be related to the issues you were experiencing. Could you please pull the latest version and try again?
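A re-test after updating could be scripted along the following lines. This is only a minimal sketch and not part of accfft: the function names `is_correct` and `sweep` and the list of rank counts are hypothetical, the `aprun` invocation mirrors the one quoted earlier in this thread, and the marker string is the one printed by step1 and step1_gpu in the outputs above.

```shell
#!/bin/sh
# Succeed only if the text on stdin contains the success marker that
# step1/step1_gpu print in the outputs quoted in this thread.
is_correct() {
    grep -q "Results are CORRECT!"
}

# Hypothetical sweep over a few rank counts (one rank per node, as in the
# failing run above). The counts 1 2 4 8 are illustrative assumptions.
sweep() {
    for n in 1 2 4 8; do
        if aprun -n "$n" -N 1 ./step1_gpu 512 512 512 | is_correct; then
            echo "n=$n: CORRECT"
        else
            echo "n=$n: INCORRECT"
        fi
    done
}
```

Calling `sweep` from the build directory would then print one line per rank count, which makes it easy to see whether the failure depends on the process grid.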
I ended up using a different library. Given that Titan is offline now and I have no access to another system with K20x GPUs, I'm closing the issue.