FFTW / fftw3

DO NOT CHECK OUT THESE FILES FROM GITHUB UNLESS YOU KNOW WHAT YOU ARE DOING. (See below.)
GNU General Public License v2.0
2.74k stars 666 forks source link

default flags and how to use inplace #364

Open Kidd-61 opened 2 months ago

Kidd-61 commented 2 months ago

hi guys In our previous project, we were using MKL's fft, and during the migration process, we encountered two issues with FFTW.

The first issue is that when using fft's c2r multiple times, the input data is defaulted to be destroyed, and we need to use the FFTW_DESTROY_INPUT flag to make it work correctly. This behavior is not present in MKL, but we had to add this flag during the migration. When I read the documentation, I found that using the PRESERVE_FLAG may lead to NaN results in the plan. We believe that having the input destroyed by default is not safe, as we want to preserve the input data as well as ensure the plan returns valid results. What would be a more reasonable way to handle this situation?

The second issue arises when using the inplace mode; the results do not match our expectations, while they are correct with MKL. Here is a relevant portion of the code:

    fftwf_complex *a = (fftwf_complex *)malloc(4 * 4 * (4 / 2 + 1) * 2 * sizeof(float));
    for(int i = 0; i < 64; i++)
    {
            ((float *)a)[i] = i;
    }
    fftwf_plan plan1 = fftwf_plan_dft_r2c_3d(4, 4, 4, (float *)a, a, FFTW_ESTIMATE);
    for(int i = 0; i < 96; i++)
    {
            printf("%d : %f | ",i, ((float *)a)[i]);
    }
    printf("\n===========\n");
    fftwf_plan plan2 = fftwf_plan_dft_c2r_3d(4, 4, 4, a, (float *)a, FFTW_ESTIMATE);

    fftwf_execute(plan1);
    fftwf_execute(plan2);
    for(int i = 0; i < 96; i++)
    {
            printf("%d : %f | ", i, ((float *)a)[i]);
    }
    printf("\n===========\n");

// and this get worse
float *b = (float *)malloc(4 * 4 * 4 * sizeof(float));
    for(int i = 0; i < 64; i++)
    {
            b[i] = i;
    }
    fftwf_execute_dft_r2c(plan1, b, a);
    fftwf_execute_dft_c2r(plan2, a, b);
    for(int i = 0; i < 64; i++)
    {
            printf("%d : %f | ",i, b[i]);
    }
    printf("\n===========\n");
    return;

In consule, we can find results at positions 4 and 5 is not what we expect. In this case, the data processed by the FFT is not from indices 0 to 63, but rather fromy 0 ~ 3, 6 ~ 9, 12 ~ 15, ect. It appears to treat the array as a 4 4 6 cube and processes 4 elements per line. However, we need it to be treated as a linear array rather than a cube. And using another input will get worse.

Although we can handle this by using out-of-place transforms, we are curious if there is a more reasonable way to achieve the desired behavior with inplace transforms, perhaps by changing a flag or build configuration. Additionally, considering that the default setting leads to unexpected results, should the default behavior prioritize safety over efficiency?

stevengj commented 2 months ago

for(int i = 0; i < 64; i++)

You're not initializing the array correctly for an in-place transform. You're not taking the padding into account. See https://www.fftw.org/fftw3_doc/Multi_002dDimensional-DFTs-of-Real-Data.html

Kidd-61 commented 2 months ago

Thank you for reply, and thank your for the doc can explain the result of a.

However, what about b, I got -nan or some random result in this example.

stevengj commented 2 months ago

You are executing an in-place plan on out-of-place arguments.