acroy / Expokit.jl

Julia implementation of EXPOKIT routines
Other
26 stars 11 forks source link

implement phimv #18

Closed mforets closed 6 years ago

mforets commented 6 years ago

This is a working implementation of Expokit's phimv.

Note: for quick tests i use the following script:

$ julia -e 'using Expokit; include("'test_phimv.jl'"); test_phimv(500)'
function test_phimv(n)
   p = 0.1; found = false
   A = sprand(n, n, p); u = rand(n); x = similar(u)
   while !found
      try
           copy!(x, A\u)
           found = true
      catch
           A = sprand(n, n, p)
      end
   end
   vec = eye(n, 1)[:]
   w1 = phimv(1.0, A, u, vec) # warmup
   tic()
   w1 = phimv(1.0, A, u, vec)
   t1 = toc()

   w2 = expm(full(A))*(vec+x)-x # warmup
   tic()
   w2 = expm(full(A))*(vec+x)-x
   t2 = toc()
   res = norm(w1-w2)/norm(w2)
   println("res = $res")
   return res, t1, t2
end
acroy commented 6 years ago

Thanks!

Then i stepped back and wrote phimv using your notation, but still using Matlab's k1 and mb variables. Do we want to get rid of k1?

Well, it's most important that it works :-) Getting rid of k1 and mb is mostly for aesthetic reasons ... I guess one could change that later.

mforets commented 6 years ago

maybe the benchmark should also be of size 100 and 1000 (i was using the small sizes for testing).

when i pass to 1000, the real is ok, but the complex case systematically fails:

testing real n=1000 (first phimv, then expm)
elapsed time: 0.009977624 seconds
elapsed time: 0.728199252 seconds
residuum: 3.608147907075249e-14

testing complex n=1000 (first phimv, then expm)
elapsed time: 0.045600646 seconds
elapsed time: 1.33383898 seconds

residuum: 0.040071250375281954

Test Failed
  Expression: res < 1.0e-6
   Evaluated: 0.040071250375281954 < 1.0e-6
ERROR: LoadError: There was an error during testing

is this something to worry about? or is A\u inaccurate? (but then why only with complex?)

acroy commented 6 years ago

No idea to be honest. It strangely happens only for n>=1000...

ChrisRackauckas commented 6 years ago

Force it to do SVD instead to make the factorization more accurate. Or if need, use GenericSVD.jl with bigfloats to make it really accurate.

mforets commented 6 years ago

Ok, i can confirm it's a problem in my phimv, because i computed the numerical result using DifferentialEquations, which confirms that obtained through x = A\u followed by a expm(full(A))*(vec+x)-x. the test script and some results are shown below.

using DifferentialEquations
function test_phimv_complex_ODE_expm(n)
     p = 0.1
     A = sprand(n, n, p) + 1im*sprand(n, n, p)
     u = rand(n) + 1im*rand(n)
     vec = eye(n, 1)[:] + 0im*eye(n,1)[:]
     tspan = (0.0,1.0)
     prob = ODEProblem((t, x) -> A*x + u, vec, tspan)

     w1 = phimv(1.0, A, u, vec) # warmup
     tic()
     w1 = phimv(1.0, A, u, vec)
     t1 = toc()

     x = A\u
     w2 = expm(full(A))*(vec+x)-x # warmup
     tic()
     w2 = expm(full(A))*(vec+x)-x 
     t2 = toc()

     sol = solve(prob,Tsit5(),reltol=1e-12,abstol=1e-12) # warm-up
     tic()
     sol = solve(prob,Tsit5(),reltol=1e-12,abstol=1e-12)
     t3 = toc()
     w3 = sol.u[end]

     res_expm_phimv = norm(w1-w2)/norm(w2)
     println("res_expm_phimv = $res_expm_phimv")

     res_ODE_phimv = norm(w1-w3)/norm(w3)
     println("res_ODE_phimv = $res_ODE_phimv")

     res_expm_ODE = norm(w2-w3)/norm(w3)
     println("res_expm_ODE = $res_expm_ODE")

     nothing
end

results

julia> test_phimv_complex_ODE_expm(100)
elapsed time: 0.009096111 seconds
elapsed time: 0.009818277 seconds
elapsed time: 0.023226098 seconds
res_expm_phimv = 1.9888493664193742e-15
res_ODE_phimv = 2.646682762067137e-13
res_expm_ODE = 2.6319930840895537e-13

julia> test_phimv_complex_ODE_expm(500)
elapsed time: 0.015341626 seconds
elapsed time: 0.311538088 seconds
elapsed time: 2.643328402 seconds
res_expm_phimv = 2.7122145436895417e-14
res_ODE_phimv = 1.5634559851137166e-12
res_expm_ODE = 1.537820468580802e-12

julia> test_phimv_complex_ODE_expm(800)
elapsed time: 0.032651792 seconds
elapsed time: 0.702406406 seconds
elapsed time: 9.760057556 seconds
res_expm_phimv = 8.038185725639112e-15
res_ODE_phimv = 2.8499976450478502e-12
res_expm_ODE = 2.8468657057080086e-12

julia> test_phimv_complex_ODE_expm(900)
elapsed time: 0.03808314 seconds
elapsed time: 1.199792549 seconds
elapsed time: 13.138305723 seconds
res_expm_phimv = 1.560882232740971e-14
res_ODE_phimv = 2.9945922002868787e-12
res_expm_ODE = 2.9792496130607893e-12

julia> test_phimv_complex_ODE_expm(1000)
elapsed time: 0.046592795 seconds
elapsed time: 1.552531932 seconds
elapsed time: 17.860762508 seconds
res_expm_phimv = 0.0355333218535399
res_ODE_phimv = 0.03553332185147861
res_expm_ODE = 3.193342844449354e-12
mforets commented 6 years ago

then the question is, what's going on? i'm going to try it in Matlab and see if they match. i have no clue.

acroy commented 6 years ago

Seems the order of the Krylov subspace is too small. Try m=40 or larger. The question is if we could detect this insufficieny?

mforets commented 6 years ago

Seems the order of the Krylov subspace is too small. Try m=40 or larger.

well done!

The question is if we could detect this insufficieny?

quoting the original paper, "... [the dimension of the Krylov subspace] ... (usually m <= 50 while n can exceed many thousands)."

this doesn't really answer the question but indicates that it's reasonable if we set m=40 by default.

Jutho commented 6 years ago

I am confused. Are the error measures not supposed to detect that the Krylov dimension is too small, or put differently, choose a time step tau that can be correctly computed with the given Krylov subspace dimension, i.e. for time step tau=0, Krylov dimension 1 is sufficient, and for any Krylov dimension there should be a stepsize tau that can be computed within the given tolerance (though it might of course be so small that it takes forever to compute the whole evolution for a time t).

acroy commented 6 years ago

After the loop while iter <= maxiter ... end we could check if the number of iterations exceeds maxiter just like in expmv?

acroy commented 6 years ago

@Jutho : Exactly. My guess is that we exceed the allowed number of iterations/subdivisions (currently maxiter=10), which is not checked at the moment.

Jutho commented 6 years ago

One thing is the loop with maxiter=10 to determine the time step tau that is currently affordable, the other thing is the loop while tk < tf that can run for a very long time if tau is always chosen much smaller than t (because the Krylov dimension is chosen unwisely small).

mforets commented 6 years ago

i observed that typically in test_phimv_complex(1000), the loop while iter <= maxiter exits with iter = 0, since it breaks by entering the branch if err_loc <= delta * tau * tol.

on the other hand, in the expmv code that check is slightly different: it reads if err_loc <= delta * tau * (tau*tol/err_loc)^r. however, using this check doesn't seem to impact the bad outcome.

what are your thoughts?

mforets commented 6 years ago

to summarize: we've seen that for m=40 it works well with n=1000 random complex matrices, but not for m=30, where m is the Krylov subspace dimension.

to me this seems like a bug for the reasons that you have provided and because when i run the similar script and similar test in Matlab, there are no such issues for m=30. it could also be that there is some loss in precision on some built-in function when the complex numbers are used, although that would be quite strange.

i'll do further experimentation when i have more time (but that will be next week for sure).

acroy commented 6 years ago

@mforets : I think I found a mistake. Try to add fill!(hm, zero(T)) after while tk < tf. This seems to solve the issue. Nevertheless we should do some more tests with "realistic matrices", but as I said this could be a separate issue (benchmarking).

mforets commented 6 years ago

Good! -- i added it in the end of the while tk < tf loop because we do the initialization outside that loop, contrary to the Matlab version. I had removed this zeros in purpose, since in the expmv version this was removed too. Now i see that there you do a fill!(w, zero(T)) in expmv. Sorry for the confusion.

In the latest commit i also changed the initialization of hm[j+1,j]. Note that in the Matlab version, in case of "happy-breakdown", the value of hm[j+1,j] is not defined to be norm(p), because this is done after the if clause. Could you please review?

acroy commented 6 years ago

Thanks! Regarding the initialization of h[j+1,i] I am not sure which is "more correct". In the paper it is done before happy breakdown, which is the reason I implemented it this way for expmv. But you are right, that in the MATLAB version it is done after happy breakdown ... It's probably not crucial since the value is small (< btol) anyways.

After squashing the commits I think we are ready to merge!

mforets commented 6 years ago

Done. Thanks @acroy, @Jutho and @ChrisRackauckas for your help.

Two things:

mforets commented 6 years ago

On the other hand, the matrices that I have are real; it would be nice to have some complex matrices.. Quantum mechanics is a good candidate to provide some, although I don't have anything concrete in mind.

acroy commented 6 years ago

Thanks! Some cleanup might be a good idea (unifying variables, defining common constants on module level, putting common code in functions, etc). Those comments are mostly there because I can never remember the argument order of axpy et al :-) (this is actually an argument in favour of using .+= etc). PRs in that direction will always be welcome.

Regarding the benchmarks. I think putting them in a separate directory is a good idea. In addition to your example, we can also use HarwellRutherfordBoeing.jl to try other matrices. For the QM example I would use a harmonic oscillator in 2D or 3D (this was also one of the tests @ChrisRackauckas referred to earlier). We can easily generate a finite-difference Laplacian in this case. I will open an issue and we can discuss there.