mpip / pfft

Parallel fast Fourier transforms
GNU General Public License v3.0

local_start of 'empty' ranks is inconsistent. #22

Open rainwoodman opened 8 years ago

rainwoodman commented 8 years ago

The local_start of an empty rank is always set to 0. This causes unnecessary branching in downstream code. The logical model is simpler if we treat these 'stencils' as having a size of zero but being offset the same way as the others.

For example, the local_i_start of a 3d r2c transform on a 2x53 domain decomposition (this set-up is sub-optimal) is currently:

([   0,  512]),
([   0,   20,   40,   60,   80,  100,  120,  140,  160,  180,  200,
        220,  240,  260,  280,  300,  320,  340,  360,  380,  400,  420,
        440,  460,  480,  500,  520,  540,  560,  580,  600,  620,  640,
        660,  680,  700,  720,  740,  760,  780,  800,  820,  840,  860,
        880,  900,  920,  940,  960,  980, 1000, 1020,    0])

I suggest changing the last 0 to 1020.

rainwoodman commented 8 years ago

Sorry, I meant 1024 not 1020.
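
To make the proposal concrete, here is a minimal sketch of the two conventions. It assumes a ceil-division block size along an axis of length 1024, which reproduces the numbers listed above; `block_starts` and its `empty_start_is_zero` flag are hypothetical names for illustration and are not part of the PFFT or FFTW API.

```python
def block_starts(n, nprocs, empty_start_is_zero=False):
    """Return a (start, size) pair per rank for a 1d block decomposition.

    Assumption: the block size is ceil(n / nprocs). This matches the
    numbers quoted in this issue but is only an illustrative model.
    """
    block = -(-n // nprocs)  # integer ceil(n / nprocs)
    result = []
    for rank in range(nprocs):
        start = min(rank * block, n)   # clamp to the end of the axis
        size = min(block, n - start)   # zero for ranks past the end
        if size == 0 and empty_start_is_zero:
            start = 0                  # current behaviour described above
        result.append((start, size))
    return result


current = block_starts(1024, 53, empty_start_is_zero=True)
proposed = block_starts(1024, 53)

# With the proposed convention every rank can form a half-open slice
# [start, start + size) without testing for emptiness first, and the
# starts are non-decreasing across ranks.
slices = [(start, start + size) for start, size in proposed]
```

Under these assumptions the last rank gets `(0, 0)` with the current convention and `(1024, 0)` with the proposed one, so `start + size` still equals the axis length and no special case is needed downstream.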

mpip commented 8 years ago

This is the way FFTW has done it for years, but I have no problem with your suggestion. I'll give it a try in the branch block_offset. It will be merged into master after some tests.

rainwoodman commented 8 years ago

Thanks!