mpip / pfft

Parallel fast Fourier transforms
GNU General Public License v3.0
54 stars, 23 forks

Strided Input and output array. #13

Closed by rainwoodman 9 years ago

rainwoodman commented 9 years ago

The FFTW guru interface allows arbitrarily strided input and output arrays. PFFT does not.

This is a useful feature for a particle mesh code where the local mesh contains a 'ghost region' that is shared with other processes but does not participate in the FFT.
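To make the ghost-region layout concrete, here is a minimal sketch of how the interior of a padded local mesh could be described by an offset plus one (n, stride) pair per axis. It assumes a 3-D row-major array with the same number of ghost layers on every side; `axis_view` and `interior_view` are hypothetical names for illustration, not PFFT or FFTW types.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical descriptor, loosely modelled on the (n, stride) pairs
 * used by strided FFT interfaces. Not a PFFT or FFTW type. */
typedef struct {
    ptrdiff_t n;      /* number of interior points along this axis */
    ptrdiff_t stride; /* distance between neighbours, in elements  */
} axis_view;

/* Describe the interior of a 3-D row-major mesh that holds n[d]
 * interior points plus `ghost` layers on both sides of each axis
 * (an assumed layout). Returns the element offset of the first
 * interior point and fills view[] with the per-axis strides. */
static ptrdiff_t interior_view(const ptrdiff_t n[3], ptrdiff_t ghost,
                               axis_view view[3])
{
    ptrdiff_t padded[3];

    assert(ghost >= 0);
    for (int d = 0; d < 3; ++d) {
        assert(n[d] > 0);
        padded[d] = n[d] + 2 * ghost;
    }

    /* Row-major: the last axis is contiguous in memory. */
    view[2].n = n[2]; view[2].stride = 1;
    view[1].n = n[1]; view[1].stride = padded[2];
    view[0].n = n[0]; view[0].stride = padded[1] * padded[2];

    /* Skip `ghost` layers along every axis to reach the interior. */
    return ghost * (view[0].stride + view[1].stride + view[2].stride);
}
```

With a 4 x 4 x 4 interior and one ghost layer, the padded array is 6 x 6 x 6, so the strides are 36, 6 and 1. Those strides and the returned offset are what a strided transform would need to be given in order to skip the ghost cells.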

mpip commented 9 years ago

The FFTW-MPI interface does not support strides either. Do you mean that the strides should be the same on each MPI process? That would not work with unequal data distributions, so strides would have to be defined for each process separately. This seems really complicated from the user's point of view. However, the main difficulty is that the global communication of PFFT is based on the FFTW parallel transpose algorithms, which do not support strides in the input arrays. This can be circumvented by using a temporary copy array with contiguous memory layout, but that doubles the amount of memory (or triples it, if the following parallel transpose works out-of-place). I tried to implement all the transpositions in a way that they can be performed in-place if memory is restricted.
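The temporary-copy workaround could look roughly like the sketch below on the user side: pack the strided interior into a contiguous buffer before the transform and scatter the result back afterwards. The helpers `pack_interior` and `unpack_interior` are hypothetical and not part of PFFT. Strides are in elements, and the strided pointer is assumed to point at the first interior element.

```c
#include <assert.h>
#include <stddef.h>

/* Gather the n[0] x n[1] x n[2] interior of a strided array into a
 * contiguous row-major buffer. `src` points at the first interior
 * element; stride[d] is in elements. Hypothetical helper. */
static void pack_interior(const double *src, const ptrdiff_t n[3],
                          const ptrdiff_t stride[3], double *dst)
{
    ptrdiff_t k = 0;

    assert(src != NULL && dst != NULL);
    for (ptrdiff_t i0 = 0; i0 < n[0]; ++i0)
        for (ptrdiff_t i1 = 0; i1 < n[1]; ++i1)
            for (ptrdiff_t i2 = 0; i2 < n[2]; ++i2)
                dst[k++] = src[i0 * stride[0] + i1 * stride[1]
                               + i2 * stride[2]];
}

/* Scatter a contiguous row-major buffer back into the strided
 * interior, leaving the ghost cells in between untouched. */
static void unpack_interior(const double *src, const ptrdiff_t n[3],
                            const ptrdiff_t stride[3], double *dst)
{
    ptrdiff_t k = 0;

    assert(src != NULL && dst != NULL);
    for (ptrdiff_t i0 = 0; i0 < n[0]; ++i0)
        for (ptrdiff_t i1 = 0; i1 < n[1]; ++i1)
            for (ptrdiff_t i2 = 0; i2 < n[2]; ++i2)
                dst[i0 * stride[0] + i1 * stride[1]
                    + i2 * stride[2]] = src[k++];
}
```

The contiguous buffer holds a second copy of the interior, which is where the extra memory comes from. It is also the only array the parallel transform would see.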

mpip commented 9 years ago

In summary, the idea sounds nice but I think implementation will be difficult.

rainwoodman commented 9 years ago

Ah. I misunderstood the FFTW interface then. Let's close this.