The code that chooses loop block sizes appears to use incorrect integer
arithmetic, possibly an integer overflow.
Reproduce:
1. Compile with IBM XL:
   ./configure --prefix=/soft/apps/p3dfft-2.4/p3dfft-2.4-fftw-3.1.2-double \
     --enable-ibm --enable-stride1 --enable-1d --enable-fftw \
     --with-fftw=/soft/apps/fftw-3.1.2-double \
     CC="tmpwrap mpixlc_r" CFLAGS="-O3 -qhot" \
     FC="tmpwrap mpixlf77_r" FCFLAGS="-O3 -qhot"
2. Use specific dimensions (see below)
What is the expected output?
Positive loop block sizes and a correct transform.
What do you see instead?
The last loop block size is negative, and the result of the transform is all zeros.
What version of the product are you using? On what operating system?
Compiled on Intrepid with IBM XL
3072^3 transforms:
Using processor grid 4 x 3072 Using loop block sizes 1 5 1 5
Using processor grid 128 x 128 Using loop block sizes 1 170 1 7
Using processor grid 8 x 1024 Using loop block sizes 1 10 1 3
Using processor grid 128 x 256 Using loop block sizes 1 170 1 14
Using processor grid 16 x 1024 Using loop block sizes 1 21 1 7
Using processor grid 128 x 512 Using loop block sizes 1 170 1 -28
Using processor grid 32 x 512 Using loop block sizes 1 42 1 7
4096^3 transforms:
Using processor grid 4 x 4096 Using loop block sizes 1 4 1 4
Using processor grid 64 x 512 Using loop block sizes 1 64 1 8
Using processor grid 128 x 512 Using loop block sizes 1 128 1 -16
Using processor grid 128 x 1024 Using loop block sizes 1 128 1 1
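
The numbers above are consistent with a signed 32-bit wraparound. As a minimal sketch, and purely as an assumption inferred from the reported output (not taken from the P3DFFT source), suppose the last block size is computed as 32768 * iproc * jproc / (8 * N * N), with a zero result replaced by 1. The formula, the constants and all function names below are hypothetical. With a 32-bit numerator this reproduces every value listed, including the -28, the -16 and the 1 for the 128 x 1024 grid. Those are exactly the cases where iproc * jproc is 65536 or more, so the product reaches 2^31.

```python
def wrap_int32(value):
    """Reduce an exact integer to what a signed 32-bit integer would hold."""
    value &= 0xFFFFFFFF
    return value - 0x100000000 if value >= 0x80000000 else value


def trunc_div(a, b):
    """Integer division that truncates toward zero, as in Fortran and C."""
    q = abs(a) // abs(b)
    return q if (a >= 0) == (b >= 0) else -q


def last_block_size(n, iproc, jproc, wide=False):
    """Hypothetical formula guessed from the report, not the library's code."""
    numerator = 32768 * iproc * jproc      # reaches 2**31 when iproc * jproc == 65536
    if not wide:
        numerator = wrap_int32(numerator)  # emulate a default 32-bit integer
    size = trunc_div(numerator, 8 * n * n)
    return size if size != 0 else 1        # assumed: a zero size is replaced by 1


# Compare 32-bit and 64-bit arithmetic for the grids that misbehave.
for n, iproc, jproc in [(3072, 128, 512), (4096, 128, 512), (4096, 128, 1024)]:
    print(n, iproc, jproc,
          last_block_size(n, iproc, jproc),
          last_block_size(n, iproc, jproc, wide=True))
```

If the real code has this shape, doing the multiplication in a 64-bit integer, or dividing before multiplying, would give 28, 16 and 32 for the three affected cases.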
Original issue reported on code.google.com by andrey.a...@gmail.com on 10 Nov 2011 at 5:55