barbagroup / fmm-bem-relaxed

Source code for the paper on inexact Krylov iterations with FMM BEM

Why am I getting a different performance & result of LaplaceBEM running on phantom? #1

Open tingyu66 opened 9 years ago

tingyu66 commented 9 years ago

I am trying to use the code to reproduce the speedup for the Laplace 1st-kind problem. I ran the first non-relaxed case (the non-relaxed case in the first row of the table below) with theta = 0.5, N = 2048 (recursions = 5), ncrit = 400.

[screenshot of the table, taken 2015-06-24]

In the terminal, I first ran it without preconditioning:

    ./LaplaceBEM -theta 0.5 -ncrit 400 -p 8 -recursions 5 -fixed_p -gmres -solver_tol 1e-5

and got this result:

initialised 2048 triangles
...
Solver: GMRES
Preconditioner: Identity
it: 001, res: 2.311e-04, fmm_req_p: 8
it: 002, res: 9.630e-05, fmm_req_p: 8
it: 003, res: 4.506e-05, fmm_req_p: 8
it: 004, res: 1.935e-05, fmm_req_p: 8
Final residual: 9.2838e-06, after 5 iterations

TIMING:
    setup : 3.6045e+00s
    solve : 1.5399e+01s
external phi: 0.192, exact: 0.19245, error: 2.3505e-03
error: 5.344e-03

The answer seemed to be correct, so I then tried to use the diagonal matrix as the preconditioner (to try to match the "t_solve = 0.27" in the table):

    ./LaplaceBEM -theta 0.5 -ncrit 400 -p 8 -recursions 5 -fixed_p -gmres -diagonal -solver_tol 1e-5

I got this:

initialised 2048 triangles
......
Creating plan: 0.00376582
Executing plan: 3.58662
1st-kind equation being solved

TIMING:
    setup : 3.5906e+00s
    solve : 4.9171e-03s
external phi: 2.1197e-10, exact: 0.19245, error: 1.0000e+00
error: 1.000e+00

It seems that the preconditioner is not working this time. I tried the "local" preconditioner with the "fgmres" solver and got the same error. I compiled the code using the Makefile in the "examples" folder, and it built the LaplaceBEM executable without errors. The gcc version on phantom is 4.6. @slayton58, would you help me figure out how to use the diagonal preconditioner correctly, so that I get performance similar to t_solve = 0.27 in this case? Thank you.

Tingyu
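The `error: 1.0000e+00` in the preconditioned run above is what a relative error looks like when the computed potential is essentially zero, which suggests the solver never ran rather than that it converged to a wrong answer. A minimal Python check, assuming the logged `error` is the relative error |phi - exact| / |exact| (the log does not state the formula, and `relative_error` is a hypothetical helper, not a function from the repository):

```python
def relative_error(computed, exact):
    """Relative error |computed - exact| / |exact| (assumed formula)."""
    return abs(computed - exact) / abs(exact)


exact = 0.19245  # "exact" value printed in both logs

# First run: phi is printed rounded to 0.192, so this only approximates
# the logged 2.3505e-03.
print(f"{relative_error(0.192, exact):.4e}")

# Second run: phi is about 2e-10, so the relative error is essentially 1.
print(f"{relative_error(2.1197e-10, exact):.4e}")  # prints 1.0000e+00
```

Under that assumption, the first run's value agrees with the log to within the rounding of the printed phi, and the second run reproduces the logged error.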

slayton58 commented 9 years ago

Has the hostname changed at all for phantom (from phantom.seas.gwu.edu)? I can no longer see the machine..

tingyu66 commented 9 years ago

No, but there is a power outage today at our building. The machine is down currently...

slayton58 commented 9 years ago

That'll do it.. Let me know when the machine is back up and I'll take a look

tingyu66 commented 9 years ago

Sure, thank you.


tingyu66 commented 9 years ago

Hi Simon, the machine is back now.

slayton58 commented 9 years ago

Comment out line 147 in include/executor/ExecutorSingleTree.hpp to remove the excess output, then add '-lazy_eval' to your command line.

You can also specify the number of OMP threads to use, which affects the runtime for me at 8k panels and above.

tingyu66 commented 9 years ago

Thanks for the reply, Simon. After adding the -lazy_eval flag and tweaking OMP_NUM_THREADS, I got LaplaceBEM performance very close to the data presented in the paper. I think these cases were run with straight GMRES (without a preconditioner).

Then I tried to run with the -diagonal preconditioner, so I uncommented lines 215-221 and lines 261-265 in "LaplaceBEM.cpp" to enable it. But I found that the "diagonal preconditioned" case always needs more iterations and more time to converge than the "unpreconditioned" case. Did you observe something similar, or am I using the diagonal preconditioner incorrectly? (I used -lazy_eval and tested with different problem sizes and accuracies.)

Thank you.

labarba commented 9 years ago

@slayton58 Note that in the thesis it says (p.54, Chapter 5 on results with Laplace BEM): "All tests were performed using a canonical right-preconditioned GMRES (algorithm 1) …"

I transferred this assertion to the paper, and repeated it in the caption of each figure.

Therefore, if our suspicions are right, and the tests were done without a preconditioner, there is a mistake in the paper in how the results are reported. We must get to the bottom of this and correct the manuscript as needed.

slayton58 commented 9 years ago

The GMRES solver used requires a preconditioner -- it cannot be run without specifying one. You say "no preconditioner", I say "Identity preconditioner".
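To make the equivalence concrete: right-preconditioned GMRES works with the operator A M^{-1}, so choosing M = I leaves the system unchanged. A minimal Python sketch of that idea follows. The class and function names are hypothetical illustrations, not the types used in this repository's C++ code, and the 2x2 matrix is an arbitrary example.

```python
# Illustrative sketch only: hypothetical names, not the fmm-bem-relaxed API.

def matvec(A, x):
    """Dense matrix-vector product for a list-of-rows matrix."""
    return [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in A]


class IdentityPreconditioner:
    """M = I: applying M^{-1} returns the vector unchanged."""

    def apply(self, v):
        return list(v)


class DiagonalPreconditioner:
    """M = diag(A): applying M^{-1} divides each entry by A[i][i]."""

    def __init__(self, A):
        self.inv_diag = [1.0 / A[i][i] for i in range(len(A))]

    def apply(self, v):
        return [d * v_i for d, v_i in zip(self.inv_diag, v)]


def right_preconditioned_matvec(A, M, u):
    """Operator seen by a right-preconditioned solver: u -> A (M^{-1} u)."""
    return matvec(A, M.apply(u))


A = [[4.0, 1.0], [2.0, 8.0]]  # arbitrary small example matrix
u = [1.0, 2.0]

plain = matvec(A, u)
with_identity = right_preconditioned_matvec(A, IdentityPreconditioner(), u)
with_diagonal = right_preconditioned_matvec(A, DiagonalPreconditioner(A), u)

print(plain == with_identity)  # True: the identity changes nothing
print(plain == with_diagonal)  # False: diagonal scaling changes the operator
```

With the identity, the solver sees exactly A, so "Identity preconditioner" and "no preconditioner" describe the same computation. Only a non-trivial M, such as the diagonal one, changes the operator that the Krylov iterations act on.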

labarba commented 9 years ago

I guess what you're saying is that the statement on p.54 refers to the fact that the code implements an algorithm for preconditioned GMRES, and the only way to not precondition is to use an identity preconditioner. But the statement is misleading to the reader, because "all tests were performed with a preconditioned GMRES" will be interpreted as having run the experiments with some preconditioning applied to the linear system (the subject of the sentence is "tests"). The statement on p.54 is the only time that the string "precondition" appears in Ch. 5. There is no other mention of what preconditioner was used with the Laplace tests. But I note that the table of parameters for the tests with red blood cells does say "Preconditioner … None" (p.90).

What I want is enough information to reproduce the results.