FourierFlows / FourierFlows.jl

Tools for building fast, hackable, pseudospectral partial differential equation solvers on periodic domains
https://bit.ly/FourierFlows
MIT License
206 stars 29 forks source link

Questions about ETDRK4 #31

Closed ChrisRackauckas closed 6 years ago

ChrisRackauckas commented 6 years ago

Hey, I am the developer of DifferentialEquations.jl and am looking to add an ETDRK4 method to our suite. Thank you for building this code since it is a great resource! I am looking at the portion:

https://github.com/FourierFlows/FourierFlows.jl/blob/master/src/timesteppers.jl#L298-L355

Could you explain a little bit about what's going on there? Is there a non-complex version or does it require a complex linear operator (LC is the linear operator?) for the contour integral (does it make sense for non-complex?)? Does that trivially extend to arbitrary dimensions as well?

And what are the dual and filtered versions?

ChrisRackauckas commented 6 years ago

I found this: https://github.com/navidcy/ETDRK4_notes/blob/master/ETDRK4-timestepping.pdf

So is this method only applicable to diagonal operators?

glwagner commented 6 years ago

Hi! Thank you for your code!

I'll try to break down what is happening.

  1. First of all, this code calculates the coefficients for the ETDRK4 method using a technique proposed by Kassam and Trefethen.

  2. LC is an array of "coefficients" (LC is an obscure abbreviation for "linear coefficient"). Our spectral method generates nk x nl equations (nk and nl being the number of modes in x and y respectively) which have the form

u_t = LC*u + N(u),

where N(u) is some nonlinear function in u. In general LC is actually an nk nl x nk nl matrix However, we restrict ourselves to problems that are linearly uncoupled. This means we can store LC in an nk x nl array and use elementwise array multiplication rather than matrix multiplication in the ETDRK4 algorithm.

  1. LC does not have to be complex; it is only the contour and contour integral that exist in complex space. If LC is real, the coefficients that depend on it should be real too (this might actually be a good test...). We don't worry about that in this code because the equations for each Fourier mode are always complex.

  2. There is a "Dual" version because we want to solve problems that, in physical space, might involve a mix of real-valued and complex-valued variables. With the algorithm we've cooked up this requires us to split the solution into two arrays (since one array is the product of rfft, and another is the product of fft, and thus have different dimensions). This isn't a complete answer I don't think --- let me know if you want me to elaborate!

  3. The "Filtered" version implements a filtering technique in Fourier space that helps prevent numerical instability for nonlinear problems. The only difference, really, is that the solution is multiplied by a "filtering" array after each time-step. This filter leaves low wavenumber content alone (is equal to 1 at low wavenumbers) and destroys high wavenumber content, similar to how viscosity destroys high wavenumber content in the Navier-Stokes equations. The code may not be written in the most intelligent way, but the gist is that a filtering operation must be inserted into the stepforward! function at the end of the time-stepper.

Comments and suggestions are definitely welcome!! There are almost certainly better, more efficient, and more Julian ways to write our code.

glwagner commented 6 years ago

Our code only works for diagonal problems. However, the algorithm defined by Kassam and Trefethen is valid for non-diagonal systems. In that case (in some sense) each "1/LC" becomes a matrix inverse instead, and exp(LC) becomes the matrix exponential.

For large problems, it may not make sense to implement ETDRK4 when LC is dense. It might make sense when LC is block diagonal. Efficiency probably requires using sparse arrays in that case. We haven't explored implementing ETDRK4 for block-diagonal LC's but I do think it would be interesting and useful to work on some time in the future.

ChrisRackauckas commented 6 years ago

Our code only works for diagonal problems. However, the algorithm defined by Kassam and Trefethen is valid for non-diagonal systems. In that case (in some sense) each "1/LC" becomes a matrix inverse instead, and exp(LC) becomes the matrix exponential.

Thanks! Yes, what I am doing is allowing arbitrary operator types, and then creating two versions of these methods, one which caches the dense matrix exponential and another that uses lazy expmv! methods. The former is probably only useful for diagonal and small PDEs, but I think it's still worthy and a good step to do first. Then the latter is good for more general PDEs. We have the former together for lower time step methods and the components to build the later pretty easily now, but I didn't know how to implement the higher order methods well but now I found the right resources 👍 .

For large problems, it's may not make sense to implement ETDRK4 when LC is dense. It might make sense when LC is block diagonal. Efficiency probably requires using sparse arrays in that case. We haven't explored implementing ETDRK4 for block-diagonal LC's but I do think it would be interesting and useful to work on some time in the future.

ApproxFun has special matrix types for getting these kinds of operators out in a way that operations are overloaded so that way they are efficient. It'll come in handy here to make the generic code work well!

glwagner commented 6 years ago

I agree wholeheartedly that its worthy to have a general ETDRK4 algorithm --- there's no shortage of coupled small systems of ODEs!

It certainly does makes sense to use a more general special type that still generates efficient code... that's a great suggestion that I hadn't thought of.

ChrisRackauckas commented 6 years ago

Looking at K&T's argument, I'm curious whether a multiprecision algorithm is a better choice in Julia. We can upconvert everything to bigfloats, perform the computation once, then store the resulting coefficients in Float64. That saves the 32-64 matrix inversions, but then uses Bigfloat algebra. Hmm...

glwagner commented 6 years ago

I don't know much about BigFloat computations and speed, but that's definitely a fascinating idea.

glwagner commented 6 years ago

Simply adding a new column to K&T's Table 2.2 for a BigFloat algorithm would be interesting to see.

ChrisRackauckas commented 6 years ago
bigz = big.(z)
big_out = @. bigz^(-3)*(-4 - 3bigz - bigz^2 + exp(bigz)*(4-bigz))

julia> big_out
11-element Array{BigFloat,1}:
 -1.3229279476884029910174740387170546619812107571134068644
5412558131351216911424e+02
  1.5484548537713570608086241405798749327174128109987872490
09028831722298910606636e-01
  1.6658049502573676565079536354858215041119884064208956518
9675612103594268925142e-01
  1.6666583054959324017300754309635466564490690888014874319
72877133822958457044721e-01
  1.6666665833055496021823984082252460930778070842037153278
66509293674525686036099e-01
  1.6666666658333055549603074596711978010938309153277373055
9482879582763455436954e-01
  1.6666666666583333055554960331068914849679340323885868687
44236925385903521643577e-01
  1.6666666666665833333055555496107155962443173731491549934
31991396068669757578724e-01
  1.6666666666666658333333055555550357372677913389484417364
89082516252499623168832e-01
  1.6666666666666666583333333055555551473223970201850601868
92837240368843288107459e-01
  1.6666666666666666665833333333055555451693379264488790335
39723875971633191734697e-01

It's definitely numerically stable. Since it doesn't require a choice of contours, that makes it much easier to design as a library function for arbitrary problems as well.

ChrisRackauckas commented 6 years ago

I translated the MATLAB code:

function cheb(N)
    N==0 && return (0,1)
    x = cos.(pi*(0:N)/N)
    c = [2; ones(N-1,1); 2].*(-1).^(0:N)
    X = repmat(x,1,N+1)
    dX = X-X'
    D  = (c*(1./c)')./(dX+(eye(N+1)))      # off-diagonal entries
    D  = D - Diagonal(sum(D,2)[:])                 # diagonal entries
    D,x
end

N = 20
D,x = cheb(N)
x = x[2:N]
w = .53*x + .47*sin.(-1.5*pi*x) - x # use w = u-x to make BCs homogeneous
u = [1;w+x;-1]
h=1/4

function contour_integral(h,M=32)
    r = 15*exp.(1im*pi*((1:M)-.5)/M) # points along complex circle
    L = D^2; L = .01*L[2:N,2:N] # 2nd-order differentiation
    A = h*L
    E = expm(A); E2 = expm(A/2);
    I = eye(N-1); Z = zeros(N-1,N-1)
    f1 = Z; f2 = Z; f3 = Z; Q = Z

    for j = 1:M
        z = r[j];
        zIA = inv(z*I-A);
        Q = Q + h*zIA*(exp(z/2)-1);
        f1 = f1 + h*zIA*(-4-z+exp(z)*(4-3*z+z^2))/z^2;
        f2 = f2 + h*zIA*(2+z+exp(z)*(z-2))/z^2;
        f3 = f3 + h*zIA*(-4-3*z-z^2+exp(z)*(4-z))/z^2;
    end
    E,E2,real(f1/M),real(f2/M),real(f3/M),real(Q/M)
end

E,E2,f1,f2,f3,Q = contour_integral(h,64)

and that gives the same output as MATLAB. However, the direct calculation gives something quite different.

function multiprecision(h)
    L = D^2; L = .01*L[2:N,2:N] # 2nd-order differentiation
    A = h*L
    E = big.(expm(A)); E2 = big.(expm(A/2));
    A = big.(A); L = big.(L)
    coeff = h^(-2) * L^(-3)
    A2 = A^2
    Q = Float64.((E-I)/A)
    a = Float64.(coeff * (-4I - A + E*(4I - 3A  + A2)))
    b = Float64.(coeff * (2I + A + E*(-2I + A)))
    c = Float64.(coeff * (-4I - 3A - A2 + E*(4I-A)))
    Float64.(E),Float64.(E2),a,b,c,Q
end

E,E2,a,b,c,Q2 = multiprecision(h)

The eigenvalues aren't even small so this should definitely be okay numerically (and it's with bigfloats of course). Am I doing anything that's obviously wrong?

glwagner commented 6 years ago

I found one small error --- in the function multiprecision(h) I think Q depends on E2, not E.

Other than that though your formula for a, b, and c seem to correspond accurately to K&L's equation (2.5).

Are we sure that the computations are performed in BigFloat precision the way the code is written? Is the same output achieved when the outputs are BigFloats and only converted to Float64 afterwards?

I don't quite understand how BigFloat works. Being arbitrary precision, does that mean that calculations involving the cancellation of large numbers somehow incur more computation time, because the needed precision is higher?

ChrisRackauckas commented 6 years ago

I found one small error --- in the function multiprecision(h) I think Q depends on E2, not E.

Thanks.

Are we sure that the computations are performed in BigFloat precision the way the code is written? Is the same output achieved when the outputs are BigFloats and only converted to Float64 afterwards?

Yes.

I don't quite understand how BigFloat works. Being arbitrary precision, does that mean that calculations involving the cancellation of large numbers somehow incur more computation time, because the needed precision is higher?

Yes. The numbers have a lot more precision so things like + and / are more expensive, meaning but the cancellation errors are much much smaller. I ran this with setprecision(BigFloat,2000) which does not seem to change then numbers, so it's not an error needing even higher precision. The eigenvalues are pretty large anyways so Float64 should even be good enough. I wonder what's going on...

ChrisRackauckas commented 6 years ago

@glwagner thanks for the guidance. I found out what was going on and summarized it here:

https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/issues/242

Turns out K&T's calculation is quite imprecise, and the contour integral is pretty unstable as well. In addition, the multiprecision method is much faster! I'm going to make a nice blog post out of this.

navidcy commented 6 years ago

@ChrisRackauckas I am puzzled with the following regarding getting the ETD coefficients using BigFloat:

Here: https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/issues/242 and during the presentation you gave at https://www.meetup.com/Southern-California-Julia-Users/events/247130330/ you were computing the exponential of the Float64 matrix L and then convert to BigFloats, i.e.

  expL  = big.(exp.(L))
  expL2 = big.(exp.(L/2))
  L = big.(L)

I did a comparison using the above and also doing what I find to be more intuitively correct, i.e. first transforming L to BigFloat and then computing the exponentials:

  L = big.(L)
  expL  = exp.(L)
  expL2 = exp.(L/2)

The results are quite different.

Here's a simple script in which L = -ν (kx^2+ky^2)^3 + i k_x:

# create a grid (without calling FourierFlows)
n, L = 128, 2π
nk = n
nkr = Int(n/2+1)

i1 = 0:Int(n/2)
i2 = Int(-n/2+1):-1

k  = reshape(2π/L * cat(1, i1, i2), (nk, 1))
kr = reshape(2π/L * cat(1, i1), (nkr, 1))
l  = k

Kr = [ kr[i] for i = 1:nkr, j = 1:nk]
Lr = [ l[j]  for i = 1:nkr, j = 1:nk]

KKrsq = Kr.^2 + Lr.^2

# create a test-LC matrix
ν = 1e-3
LC = -ν*KKrsq.^3  + im*Kr

dt = 0.01       # the timestep

# ETD coefficients with contour integration
function getetdcoeffs(dt, LC; ncirc=32, rcirc=1)
  shape = Tuple(cat(1, ncirc, ones(Int, ndims(LC))))

  circ = zeros(Complex{Float64}, shape)
  circ .= rcirc * exp.(2π*im/ncirc*(0.5:1:(ncirc-0.5)))
  circ = permutedims(circ, ndims(circ):-1:1)

  zc = dt*LC .+ circ
  M = ndims(LC)+1

  # Four coefficients: ζ, α, β, Γ
  ζc = @.          ( exp(zc/2)-1 ) / zc
  αc = @. ( -4 - zc + exp(zc)*(4 - 3zc + zc^2) ) / zc^3
  βc = @.    ( 2  + zc + exp(zc)*(-2 + zc) ) / zc^3
  Γc = @. ( -4 - 3zc - zc^2 + exp(zc)*(4 - zc) ) / zc^3

  ζ = dt*squeeze(mean(ζc, M), M)
  α = dt*squeeze(mean(αc, M), M)
  β = dt*squeeze(mean(βc, M), M)
  Γ = dt*squeeze(mean(Γc, M), M)

  ζ, α, β, Γ
end

# ETD coefficients using Big Floats
function getetdcoeffsBigFloat(dt, LC)
  L = dt*LC
  M = ndims(LC)+1

  L = big.(L)
  expL  = exp.(L)
  expL2 = exp.(L/2)

  ζ = @. dt*Complex{Float64}( ( expL2-1 ) / L )
  α = @. dt*Complex{Float64}( (-4 - L + expL*(4-3L + L^2))/L^3 )
  β = @. dt*Complex{Float64}( ( 2 + L + expL*(-2 + L))/L^3 )
  Γ = @. dt*Complex{Float64}( (-4 - 3L -L^2 + expL*(4 - L))/L^3 )

  if isnan(ζ[1, 1]); ζ[1, 1] = dt/2; end
  if isnan(α[1, 1]); α[1, 1] = dt/6; end
  if isnan(β[1, 1]); β[1, 1] = dt/6; end
  if isnan(Γ[1, 1]); Γ[1, 1] = dt/6; end

  ζ, α, β, Γ
end

# ETD coefficients using Big Floats, a la Chris
function getetdcoeffsBigFloat_a_la_chris(dt, LC)
  L = dt*LC
  M = ndims(LC)+1

  expL  = big.(exp.(L))
  expL2 = big.(exp.(L/2))
  L = big.(L)

  ζ = @. dt*Complex{Float64}( ( expL2-1 ) / L )
  α = @. dt*Complex{Float64}( (-4 - L + expL*(4-3L + L^2))/L^3 )
  β = @. dt*Complex{Float64}( ( 2 + L + expL*(-2 + L))/L^3 )
  Γ = @. dt*Complex{Float64}( (-4 - 3L -L^2 + expL*(4 - L))/L^3 )

  if isnan(ζ[1, 1]); ζ[1, 1] = dt/2; end
  if isnan(α[1, 1]); α[1, 1] = dt/6; end
  if isnan(β[1, 1]); β[1, 1] = dt/6; end
  if isnan(Γ[1, 1]); Γ[1, 1] = dt/6; end

  ζ, α, β, Γ
end

@time ζ1, α1, β1, Γ1 = getetdcoeffs(dt, LC, ncirc=32, rcirc=1)
@time ζ2, α2, β2, Γ2 = getetdcoeffsBigFloat(dt, LC)
@time ζ3, α3, β3, Γ3 = getetdcoeffsBigFloat_a_la_chris(dt, LC)

println(" ")
println("compare contour integration and getetdcoeffsBigFloat")
println("diff error ζ: ", Float64.(norm(ζ1-ζ2, Inf)))
println("diff error α: ", Float64.(norm(α1-α2, Inf)))
println("diff error β: ", Float64.(norm(β1-β2, Inf)))
println("diff error Γ: ", Float64.(norm(Γ1-Γ2, Inf)))

println(" ")
println("compare getetdcoeffsBigFloat and getetdcoeffsBigFloat_a_la_chris")
println("diff error ζ: ", Float64.(norm(ζ3-ζ2, Inf)))
println("diff error α: ", Float64.(norm(α3-α2, Inf)))
println("diff error β: ", Float64.(norm(β3-β2, Inf)))
println("diff error Γ: ", Float64.(norm(Γ3-Γ2, Inf)))

and its output

compare contour integration and getetdcoeffsBigFloat
diff error ζ: 1.2348486740795625e-17
diff error α: 3.815299397420312e-16
diff error β: 2.043783451078829e-16
diff error Γ: 8.422537622952059e-16

compare getetdcoeffsBigFloat and getetdcoeffsBigFloat_a_la_chris
diff error ζ: 2.306922014527757e-14
diff error α: 0.00045843911787690907
diff error β: 0.00022921898468334615
diff error Γ: 0.00045843682086963745

Any comments/input on that?

ChrisRackauckas commented 6 years ago

For the diagonal case, yes do

  L = big.(L)
  expL  = exp.(L)
  expL2 = exp.(L/2)

(and fix the 0's via the limit like you did). For the non-diagonal case, notice:

julia> expm(big.(rand(4,4)))
ERROR: MethodError: no method matching expm(::Array{BigFloat,2})

That's the main reason why I didn't do a bigfloat matrix exponential: the algorithm doesn't exist. In the comment (https://github.com/JuliaDiffEq/OrdinaryDiffEq.jl/issues/242#issuecomment-356746021) I noted that for the problem I was testing (the one from the ETDRK4 paper) the Float64 expm was correct to 1e-15, so that was fine. I don't think the expm algorithm has amazing error bounds either (http://epubs.siam.org/doi/abs/10.1137/04061101X) so there are probably still some bad cases until Julia gets a generic expm that works with BigFloats, and when it does then definitely it should be used.

P.S. I haven't gotten the slides up because I want to clean it up for a blog post.