Closed francescoboc closed 5 years ago
The c2r transform assumes that the data is equivalent to "half" of an array with conjugate-symmetry, which corresponds to the DFT of real inputs. Since the data is actually slightly over half of the "logical" conjugate-symmetric array, this means that it must contain some redundancy.
If you give the c2r transform "random" complex inputs that don't have the requisite symmetry/redundancy, then the the input gets projected into the requisite subspace and you lose some information.
Moral: only use c2r transforms if the inputs are the output of an r2c transform or equivalent.
@stevengj Yes, I understand, but I think that the problem is hidden in that "or equivalent". What does it mean exactly? Because in my case I was not giving at all "random" complex inputs to c2r
, I was giving an array obtained from a r2c
transform, on which I did some operations (of course I was doing operations, what would be the point of doing r2c
and then straight away c2r
to obtain the identity operation?).
Please, have a look at my Stackoverflow post, and you will see what I mean. The code is now written using fftw_plan_dft_2d
, and not the c2r
, r2c
anymore, but you can easily imagine how it was before (if you can't, I can rewrite the code and send it to you). And it was not working, the only way to make it work has been to consider the dh
array as complex, a thing that I couldn't have done using c2r
. It took me almost 2 months to figure it out.
I also found another SO question of 7 years ago which describes EXACTLY the same problem.
Moral: using c2r
and r2c
transforms to implement pseudospectral methods to integrate PDEs is NOT a good idea. But I insist: this should be mentioned somewhere in the guide.
You have to do operations that preserve the conjugate symmetry, i.e. that produce an array that could have been the result of an r2c transform. For example, multiplying by a real scalar, or elementwise multiplying the results of two r2c transforms for a convolution, or summing the results of two r2c transforms.
If you are using fftw_plan_dft_2d
and get real output, then you could have used c2r. If you are not getting real output (up to roundoff errors), then taking the real parts of the output is throwing away data.
It is perfectly fine to use r2c and c2r transforms to implement pseudospectral methods for PDEs involving real variables. (In fact, one such code is exactly what they were initially written for in FFTW.)
In a quick glance at the stackoverflow code you posted, it looks like the problem may be that you are taking the second derivative (multiplying by the wavevector "q^2" in Fourier space) in a way that does not preserve conjugate symmetry, and hence will not produce real outputs. In particular, when doing spectral differentiation, you often need careful handling of the "middle" elements, or the Nyquist frequency in 1d, as I review in these notes.
Update: at second glance, your q^2 is probably okay, but it's worth taking a closer look.
(In particular, your SO post says "Basically, because of stupid rounding errors, the IFT of the FT of a real function (in my case dh), is NOT purely real." Are you sure these are floating-point "rounding" errors, and not incorrect handling of the "Nyquist" elements? Try looking at these errors for a very coarse spatial grid — if they are rounding errors, they should still be on the order of 1e-13 times the norm of the array, but if you are mishandling of the Nyquist elements or similar then the erroneous imaginary parts should become much larger as you coarsen your discretization.)
@stevengj Thanks for your answer, but I don't think q^2 is the problem. I tried to remove the multiplication by q^2 and the code is still unstable (if I discard the imaginary part of dh
).
Moreover, have a look at this code:
# include <stdio.h>
# include <stdlib.h> // contains rand()
# include <math.h> // contains M_PI and pow()
# include <complex.h> // for complex notation of arrays
# include <omp.h> // OpenMP for parallelization
# include <fftw3.h> // FFTW3 for Fast Fourier Transforms
# define pi M_PI
int main(){
// define grid size and spacing
int Nx = 200; // n of points on x
int Ny = 200; // n of points on y
double dx = 0.5; // bin size on x and y
// define simulation time and time step
long int Nt = 1 *pow(10,4); // no. of time steps
double dt = 0.5; // time step size
// total number of frames to plot (at denominator)
int nframes = Nt/100;
// define the noise for the initial condition
double rn, drift = 0.05; // punctual drift of h(x)
srand(666); // seed the RNG
int nthreads = 10; // no. of threads used in the parallelization
// other variables
int i, j, nt; // variables for space and time loops
// set the parameters for FFTW3 and OpenMP parallelization
fftw_init_threads(); // initialize the threads for FFTW3
omp_set_num_threads(nthreads); // set the no. of threads for OpenMP
fftw_plan_with_nthreads(omp_get_max_threads()); // passing the no. of OMP threads to FFTW3
// declare and allocate memory for real variables
double *Linft = fftw_alloc_real(Nx*Ny);
double *Q2 = fftw_alloc_real(Nx*Ny);
double *qx = fftw_alloc_real(Nx);
double *qy = fftw_alloc_real(Ny);
// declare and allocate memory for complex variables
fftw_complex *dh = fftw_alloc_complex(Nx*Ny);
fftw_complex *dhft = fftw_alloc_complex(Nx*Ny);
fftw_complex *Nonl = fftw_alloc_complex(Nx*Ny);
fftw_complex *Nonlft = fftw_alloc_complex(Nx*Ny);
// create the FFTW plans
fftw_plan FT_dh_dhft = fftw_plan_dft_2d ( Nx, Ny, dh, dhft, FFTW_FORWARD, FFTW_ESTIMATE );
fftw_plan FT_Nonl_Nonlft = fftw_plan_dft_2d ( Nx, Ny, Nonl, Nonlft, FFTW_FORWARD, FFTW_ESTIMATE );
fftw_plan IFT_dhft_dh = fftw_plan_dft_2d ( Nx, Ny, dhft, dh, FFTW_BACKWARD, FFTW_ESTIMATE );
// open file to store the data
char acstr[160];
FILE *fp;
sprintf(acstr, "CH2d_IE_dt%.2f_dx%.3f_Nt%ld_Nx%d_Ny%d_#f%d.dat",dt,dx,Nt,Nx,Ny,nframes);
// generate h(x,y) at initial time (random noise)
for ( i = 0; i < Nx*Ny; i++ ) {
rn = (double) rand()/RAND_MAX; // extract a random number between 0 and 1
dh[i] = drift-2.0*drift*rn; // shift of +-drift
}
// execute plan for the first time
fftw_execute (FT_dh_dhft);
// calculate wavevectors with FFTW ordering and simmetry
for (i = 0; i < Nx; i++) { qx[i] = 2.0*i*pi/(Nx*dx); }
for (i = 0; i < Ny; i++) { qy[i] = 2.0*i*pi/(Ny*dx); }
for (i = 1; i < Nx/2; i++) { qx[Nx-i] = -qx[i]; }
for (i = 1; i < Ny/2; i++) { qy[Ny-i] = -qy[i]; }
// calculate q^2 and the FT of the linear operator
for ( i = 0; i < Nx; i++ ) {
for ( j = 0; j < Ny; j++ ) {
Q2[i*Ny+j] = qx[i]*qx[i] + qy[j]*qy[j];
Linft[i*Ny+j] = -Q2[i*Ny+j]*Q2[i*Ny+j];
}
}
// TIME LOOP
for(nt = 0; nt <= Nt; nt++) {
if((nt % nframes)== 0) {
printf("%.0f%%\r",((double)nt/(double)Nt)*100);
fflush(stdout);
// write data to file
fp = fopen(acstr,"a");
for ( i = 0; i < Nx; i++ ) {
for ( j = 0; j < Ny; j++ ) {
fprintf(fp, "%.6f ", dh[i*Ny+j]);
}
fprintf(fp, "\n");
}
fclose(fp);
// check if h is NaN
if (isnan(dh[1])!=0) {
printf("crashed!\n");
return 0;
}
}
// calculate h^3 in direct space
#pragma omp parallel for
for ( i = 0; i < Nx*Ny; i++ ) {
Nonl[i] = dh[i]*dh[i]*dh[i];
}
// Fourier transform of h^3 (Nonl only contains h^3 now)
fftw_execute (FT_Nonl_Nonlft);
// calculate Nonlinear term (dxxh^3 - dxxh) in Fourier space
// NB second derivative in Fourier space is just multiplication by -q^2
#pragma omp parallel for
for ( i = 0; i < Nx*Ny; i++ ) {
Nonlft[i]= -Q2[i]*(Nonlft[i]-dhft[i]);
}
// Implicit Euler scheme in Fourier space
#pragma omp parallel for
for ( i = 0; i < Nx*Ny; i++ ) {
dhft[i] = (dhft[i] + dt*Nonlft[i])/(1.0 - dt*Linft[i]);
}
// transform h back in direct space
fftw_execute (IFT_dhft_dh);
// normalize
#pragma omp parallel for
for ( i = 0; i < Nx*Ny; i++ ) {
dh[i] = dh[i] / (double) (Nx*Ny);
}
}
// termiate the FFTW3 plan and clean the memory
fftw_destroy_plan(FT_dh_dhft);
fftw_destroy_plan(FT_Nonl_Nonlft);
fftw_destroy_plan(IFT_dhft_dh);
fftw_cleanup();
fftw_cleanup_threads();
fftw_free(dh);
fftw_free(Nonl);
fftw_free(qx);
fftw_free(qy);
fftw_free(Q2);
fftw_free(Linft);
fftw_free(dhft);
fftw_free(Nonlft);
printf("end!\n");
return 0;
}
That's basically the same code that I posted on SO, but it's parallelized with OMP and uses complex.h
. The dh
array should be real, and indeed if we inspect the output file we see that it contains no imaginary part: -0.016216 0.041131 -0.004177 -0.016961 -0.003971 0.045668 0.008690 ...
and I can also plot it directly using the imgshow
function of matplotlib/python. Great!
But if I change the line in which I normalize dh
from this:
dh[i] = dh[i] / (double) (Nx*Ny);
to this:
dh[i] = creal(dh[i]) / (double) (Nx*Ny);
than the code is unstable again! (And it happens the same if I change the transform functions to r2c
and c2r
).
Is it there something that I am missing?
dh
array should be real, and indeed if we inspect the output file we see that it contains no imaginary part:-0.016216 0.041131 -0.004177 -0.016961 -0.003971 0.045668 0.008690 ...
and I can also plot it directly using the imgshow function of matplotlib/python. Great!
Clearly it is not purely real if creal(dh[i])
changes the answers compared to dh[i]
; printing only 5 digits hides the magnitude of the imaginary parts. The question is, how big are the imaginary parts? Try computing (maximum abs(cimag)) / (maximum abs(creal))
of your array for a small value of Nx
and Ny
(say 6).
@stevengj Yea, you are right, sorry. I thought that fprintf
worked differently for complex arrays (it's the first time that I use complex.h
).
I tried to do what you suggest, and for Nx = Ny = 6
and after Nt = 1000000
(10^6) steps, I get 0.0000000000000000000000000
. For a slightly bigger array, however (Nx = Ny = 25
, Nt
unchanged), I get 0.0000000000024354170460225
. For Nx = Ny = 100
the ratio is 0.0000000000005667251389814
(smaller). And for Nx = Ny = 200
I get 0.0000000000000660306547623
(even smaller).
What is it after 1 timestep? The magnitudes you're reporting do sound consistent with floating-point roundoff errors.
However, it sounds like the problems you are seeing have nothing to do with r2c/c2r, since you say they happen even for complex data if you discard the imaginary parts. Hence we should probably close this issue.
If that's the case — your algorithm is incredibly sensitive to inevitable tiny floating-point roundoff errors — then you may have a numerically unstable algorithm that you'll want to re-think.
After 1 timestep I get the following ratios for different Nx
,Ny
:
Nx = Ny = 6
-> 0.00000000000000000000
Nx = Ny = 25
-> 0.00248675364570205993
Nx = Ny = 100
-> 0.00000000000000025229
Nx = Ny = 200
-> 0.00000000000000028818
Well, yes, these problems happen when I discard the residual imaginary parts of the function that I am integrating, but what I am saying is that when I use r2c/c2r
transforms, then it is impossible to fix the issue, because the c2r
function does not keep these imaginary parts.
For what concerns the sensibility of my algorithm, I tried other, more sophisticated ones, like Exponential Time DIfferencing (ETD) and ETD Runge-Kutta 2nd order (ETDRK2) [https://doi.org/10.1006/jcph.2002.6995]. They were still crashing, a bit later in time since they are less "sensible", but still crashing.
I also tried to change the equation, since I thought that maybe the problem is the h^3 in the nonlinear term. I tried Swift-Hohemberg and Kuramoto-Sivashinsky, but the bug was still there.
I did not find a way to avoid the instability when using r2c/c2r
and summing recursively dhft
and a term obtained as FT(IFT(dhft
)).
If you are getting 0.00248675364570205993
then that is not a roundoff error and you have a bug (but maybe only for the odd-N case). You might also monitor the maximum imaginary part over time — it could be that at some point you have a much larger imaginary part even for the even-N case, which would help pinpoint your problem.
If you have a numerical instability in your approach, then just switching to another integrator will probably not fixed it. You have some fundamental problem in your formulation if it is sensitive to errors on the order of 2e-16
, and you need to re-think what you are doing at a more basic level. If you have a bug where sometimes you get a large imaginary part due to some symmetry-breaking in your algorithm, then you also need to find and fix that.
Not using r2c/c2r is only "fixing" your problem at a very superficial level; there is no error in these transforms.
Yes, sorry it is my fault, It was a mistake to run the program for Nx = Ny = 25
. As it is written now, my program can only handle even N's. Indeed, for even numbers, I get:
Nx=Ny=24
-> 0.00000000000000014039
Nx=Ny=26
-> 0.00000000000000011522
The unstable behaviour that I described in the previous posts happens for even N's, where there are only roundoff errors in the imaginary part of dh
.
I am not claiming that there is an error in the r2c/c2r transforms, but I think that the fact that these functions are discarding the residual imaginary part in dh
, makes a pseudospectral approach, which basically involves a sum of dhft
and FT(IFT(dhft
)), unstable after a large number of iterations.
Sorry if I insist, but my code is just a straight-forward implementation of IE/ETD/ETDRK2, so I really don't see where I could have a fundamental problem in the approach. Besides, as I already mentioned, there is at least another user reporting an analogous problem when using r2c/c2r
to implement a pseudospectral method.
I ran into an annoying problem that was coming from the fact that the
fftw_plan_dft_c2r_2d
"chops off" the residual imaginary parts from the rounding error of the IFT. To solve the problem I had to switch tofftw_plan_dft_2d
and work with complex arrays. Everything is explained in detail in this Stackoverflow question and answer that I wrote. I think that this behaviour offftw_plan_dft_c2r_2d
should be mentioned in the guide, as it can lead to problems when used recursively in a sum as I did.