Closed xgarnaud closed 4 years ago
I'll have a look at the matrix. Meanwhile, #144 could be of help to you.
Thank you. Contrary to that case, here the diagonal entries are non zero due to the stabilization terms.
Based on #144 I used
{
"solver": {
"type": "fgmres",
"M": "50",
"maxiter": "50"
},
"precond": {
"pmask_pattern": "%3:4",
"usolver": {
"solver": {
"type": "bicgstab",
"tol": "1e-9",
"maxiter": "1"
},
"precond": {
"class": "relaxation",
"type": "ilu0"
}
},
"psolver": {
"solver": {
"type": "bicgstab",
"tol": "1e-9",
"maxiter": "1"
},
"precond": {
"class": "amg",
"relax": {
"type": "ilu0"
},
"coarsening": {
"type": "smoothed_aggregation"
}
}
}
}
}
Convergence takes .4s and 36 iterations, which is much better.
Ok, so it looks like you could just use the monolithic approach here (simply setup the amg preconditioner for the whole system):
$ ./solver -A taylor_couette_mat.mm -f taylor_couette_rhs.mm
Solver
======
Type: BiCGStab
Unknowns: 15504
Memory footprint: 847.88 K
Preconditioner
==============
Number of levels: 2
Operator complexity: 1.02
Grid complexity: 1.02
Memory footprint: 14.15 M
level unknowns nonzeros memory
---------------------------------------------
0 15504 765504 13.86 M (98.38%)
1 331 12643 294.49 K ( 1.62%)
Iterations: 38
Error: 9.71585e-09
[Profile: 1.063 s] (100.00%)
[ reading: 0.833 s] ( 78.38%)
[ setup: 0.017 s] ( 1.60%)
[ solve: 0.212 s] ( 19.99%)
the performance is a bit better if you tell it that the system has 4x4 block structure:
$ ./solver -A taylor_couette_mat.mm -f taylor_couette_rhs.mm -p precond.coarsening.aggr.block_size=4
Solver
======
Type: BiCGStab
Unknowns: 15504
Memory footprint: 847.88 K
Preconditioner
==============
Number of levels: 2
Operator complexity: 1.03
Grid complexity: 1.03
Memory footprint: 14.78 M
level unknowns nonzeros memory
---------------------------------------------
0 15504 765504 14.31 M (97.15%)
1 420 22448 482.22 K ( 2.85%)
Iterations: 25
Error: 9.12836e-09
[Profile: 0.999 s] (100.00%)
[ reading: 0.847 s] ( 84.70%)
[ setup: 0.018 s] ( 1.77%)
[ solve: 0.135 s] ( 13.48%)
You can also use the Schur pressure correction. Here I am using the mixed-precision one:
./schurpc_mixed -A taylor_couette_mat.mm -f taylor_couette_rhs.mm -P prm.json
Solver
======
Type: FGMRES(30)
Unknowns: 15504
Memory footprint: 7.34 M
Preconditioner
==============
Schur complement (two-stage preconditioner)
Unknowns: 15504(3876)
Nonzeros: 765504
Memory: 24.92 M
[ U ]
Solver
======
Type: BiCGStab
Unknowns: 11628
Memory footprint: 317.95 K
Preconditioner
==============
Relaxation as preconditioner
Unknowns: 11628
Nonzeros: 430596
Memory: 10.27 M
[ P ]
Solver
======
Type: BiCGStab
Unknowns: 3876
Memory footprint: 105.98 K
Preconditioner
==============
Number of levels: 2
Operator complexity: 1.23
Grid complexity: 1.10
Memory footprint: 1.07 M
level unknowns nonzeros memory
---------------------------------------------
0 3876 47844 943.90 K (81.14%)
1 384 11120 154.64 K (18.86%)
Iterations: 5
Error: 7.67882e-09
[Profile: 1.103 s] (100.00%)
[ reading: 0.901 s] ( 81.71%)
[ schur_complement: 0.201 s] ( 18.26%)
[ setup: 0.025 s] ( 2.30%)
[ solve: 0.176 s] ( 15.93%)
Again, the performance may be improved if you use 3x3 block values for the U subsystem:
./schurpc_mixed -A taylor_couette_mat.mm -f taylor_couette_rhs.mm -P prm.json --ub 3
Solver
======
Type: FGMRES(30)
Unknowns: 15504
Memory footprint: 7.34 M
Preconditioner
==============
Schur complement (two-stage preconditioner)
Unknowns: 15504(3876)
Nonzeros: 765504
Memory: 18.81 M
[ U ]
Solver
======
Type: BiCGStab
Unknowns: 3876
Memory footprint: 317.95 K
Preconditioner
==============
Relaxation as preconditioner
Unknowns: 3876
Nonzeros: 47844
Memory: 4.15 M
[ P ]
Solver
======
Type: BiCGStab
Unknowns: 3876
Memory footprint: 105.98 K
Preconditioner
==============
Number of levels: 2
Operator complexity: 1.23
Grid complexity: 1.10
Memory footprint: 1.07 M
level unknowns nonzeros memory
---------------------------------------------
0 3876 47844 943.90 K (81.14%)
1 384 11120 154.64 K (18.86%)
Iterations: 5
Error: 7.67883e-09
[Profile: 0.963 s] (100.00%)
[ reading: 0.873 s] ( 90.73%)
[ schur_complement: 0.089 s] ( 9.25%)
[ setup: 0.011 s] ( 1.19%)
[ solve: 0.077 s] ( 8.03%)
The important numbers to look at in all the tests above are setup
and solve
times. Looks like the last test has the best performance with 0.011s for setup and 0.077s for solve (or 0.089s total solution time).
Forgot to attach the prm.json
with the SchurPC parameters:
{
"solver": {
"type": "fgmres"
},
"precond": {
"pmask_size": 15504,
"pmask_pattern": "%3:4",
"approx_schur": false,
"adjust_p": 1,
"psolver": {
"solver": {
"type": "bicgstab",
"maxiter": 8,
"tol": 1e-1
}
},
"usolver": {
"solver": {
"type": "bicgstab",
"maxiter": 2,
"tol": 1e-1
},
"precond": {
"class": "relaxation",
"type": "ilu0"
}
}
}
}
Contrary to that case, here the diagonal entries are non zero due to the stabilization terms.
(#144 eventually also converged to non-zero pressure diagonal)
Just to note, that the solution time of the monolithic solver is better with ilu0 relaxation, but the setup time suffers, so it is basically a draw with the second test above:
$ ./solver -A taylor_couette_mat.mm -f taylor_couette_rhs.mm \
-p precond.coarsening.aggr.block_size=4 precond.relax.type=ilu0
Solver
======
Type: BiCGStab
Unknowns: 15504
Memory footprint: 847.88 K
Preconditioner
==============
Number of levels: 2
Operator complexity: 1.03
Grid complexity: 1.03
Memory footprint: 26.77 M
level unknowns nonzeros memory
---------------------------------------------
0 15504 765504 26.30 M (97.15%)
1 420 22448 482.22 K ( 2.85%)
Iterations: 7
Error: 8.92549e-09
[Profile: 0.965 s] (100.00%)
[ reading: 0.832 s] ( 86.24%)
[ setup: 0.051 s] ( 5.30%)
[ solve: 0.081 s] ( 8.42%)
And for the problem of this size, single level ilu0 (with 4x4 block values) seems to be the winner:
$ ./solver -A taylor_couette_mat.mm -f taylor_couette_rhs.mm -1 -b4 precond.type=ilu0
Solver
======
Type: BiCGStab
Unknowns: 3876
Memory footprint: 847.88 K
Preconditioner
==============
Relaxation as preconditioner
Unknowns: 3876
Nonzeros: 47844
Memory: 12.55 M
Iterations: 16
Error: 8.8943e-09
[Profile: 0.880 s] (100.00%)
[ reading: 0.825 s] ( 93.80%)
[ setup: 0.008 s] ( 0.88%)
[ solve: 0.046 s] ( 5.26%)
Thank you very much.
Hi,
I am trying to use the pressure projection solver for a matrix coming from a SUPG-PSPG FEM discretization of the incompressible Navier Stokes equations (3D). The DOFs are (u0, v0, w0, p0, u1, v1, w1, p1, ...).
I am struggling to find the correct setup for the solver in this case. Here are the matrix & RHS after some iterations, as the solver converges to a steady solution:
https://drive.google.com/file/d/1giG4ETr4rbc2PoJ56zWbrf71trsgYMY0/view?usp=sharing https://drive.google.com/file/d/1LkEHaOg5_WV7672qXS8oFaQvfTJRzJua/view?usp=sharing
For this matrix & RHS, Eigen's BiCGStab/ILU converges in 12 iterations and 0.8s while AMCGL takes 15 iterations and 7s. I must be doing something wrong! I tried using the settings from https://github.com/ddemidov/amgcl_benchmarks/blob/master/shared_mem/navier-stokes/schur.cpp (see below) which brings the time down to 2.1s but the number of iterations up to 25. I must still be doing something wrong... Could you please point me to the correct setup?
Code:
Extra parameters