Closed qzxuhui closed 3 years ago
Can you save the matrix and the RHS in the Matrix Market format? For example, with the sparse amgcl::io::mm_write() and the dense amgcl::io::mm_write()?
Sorry, sir. I have updated the matrix and vector files at the link to the Matrix Market format. The results are shown below.
xuhui@xuhui-Office:~/amgcl-master/build/examples$ ./solver --matrix /home/xuhui/Temp/matrix.mtx
Solver
======
Type: BiCGStab
Unknowns: 219852
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.25
Grid complexity: 1.08
Memory footprint: 209.15 M
level unknowns nonzeros memory
---------------------------------------------
0 219852 8989578 170.93 M (80.07%)
1 16467 2141353 35.60 M (19.07%)
2 777 96595 2.61 M ( 0.86%)
Iterations: 100
Error: 0.00502124
[Profile: 14.810 s] (100.00%)
[ reading: 8.414 s] ( 56.81%)
[ setup: 0.310 s] ( 2.09%)
[ solve: 6.085 s] ( 41.08%)
xuhui@xuhui-Office:~/amgcl-master/build/examples$ ./solver --matrix /home/xuhui/Temp/matrix.mtx --rhs /home/xuhui/Temp/prd_15.mtx
Solver
======
Type: BiCGStab
Unknowns: 219852
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.25
Grid complexity: 1.08
Memory footprint: 209.15 M
level unknowns nonzeros memory
---------------------------------------------
0 219852 8989578 170.93 M (80.07%)
1 16467 2141353 35.60 M (19.07%)
2 777 96595 2.61 M ( 0.86%)
Iterations: 100
Error: 6.84824e-06
[Profile: 14.805 s] (100.00%)
[ reading: 8.374 s] ( 56.57%)
[ setup: 0.319 s] ( 2.15%)
[ solve: 6.110 s] ( 41.27%)
Sir, is there any rule which can help improve performance?
Your matrix has a block structure with 3x3 blocks. We can exploit it by using a block-valued backend (amgcl::backend::builtin<amgcl::static_matrix<double,3,3>>). This already improves things:
./solver -A matrix.mtx -f prd_15.mtx -b3
Solver
======
Type: BiCGStab
Unknowns: 73284
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.23
Grid complexity: 1.07
Memory footprint: 146.40 M
level unknowns nonzeros memory
---------------------------------------------
0 73284 998886 124.28 M (81.41%)
1 5063 217965 19.94 M (17.76%)
2 242 10102 2.18 M ( 0.82%)
Iterations: 93
Error: 8.19976e-09
[Profile: 13.918 s] (100.00%)
[ reading: 6.983 s] ( 50.17%)
[ setup: 0.217 s] ( 1.56%)
[ solve: 6.717 s] ( 48.26%)
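The drop from 219852 scalar unknowns to 73284 block unknowns, and from 8989578 scalar nonzeros to 998886 block nonzeros, comes from treating each 3x3 group of entries as a single matrix value. A rough sketch of that bookkeeping, assuming a 0-based CSR layout; count_blocks is a hypothetical helper and not how amgcl converts the matrix internally:

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical helper (not amgcl code): counts the B x B blocks that hold
// at least one stored entry of a scalar CSR matrix with n rows.
std::size_t count_blocks(std::size_t n,
                         const std::vector<std::ptrdiff_t> &ptr,
                         const std::vector<std::ptrdiff_t> &col,
                         std::size_t B)
{
    assert(B > 0 && n % B == 0); // rows must split evenly into blocks
    assert(ptr.size() == n + 1);
    std::set<std::pair<std::size_t, std::size_t>> blocks;
    for (std::size_t i = 0; i < n; ++i)
        for (std::ptrdiff_t j = ptr[i]; j < ptr[i + 1]; ++j)
            blocks.insert({i / B, static_cast<std::size_t>(col[j]) / B});
    return blocks.size();
}
```

Calling count_blocks(n, ptr, col, 3) on the scalar matrix would give the number of block nonzeros a 3x3 block-valued matrix needs, while n / 3 gives the number of block unknowns.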
Further improvement comes from switching to damped_jacobi for smoothing and estimating the spectral radius of the matrix to improve the quality of smoothed aggregation:
./solver -A matrix.mtx -f prd_15.mtx -b3 \
precond.relax.type=damped_jacobi \
precond.coarsening.estimate_spectral_radius=true precond.coarsening.power_iters=5
Solver
======
Type: BiCGStab
Unknowns: 73284
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.23
Grid complexity: 1.07
Memory footprint: 146.29 M
level unknowns nonzeros memory
---------------------------------------------
0 73284 998886 124.28 M (81.42%)
1 5063 217965 19.96 M (17.77%)
2 241 10019 2.04 M ( 0.82%)
Iterations: 67
Error: 8.17358e-09
[Profile: 11.650 s] (100.00%)
[ reading: 6.763 s] ( 58.05%)
[ setup: 0.253 s] ( 2.17%)
[ solve: 4.633 s] ( 39.77%)
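To give an idea of what a spectral radius estimate with a handful of power iterations involves, here is a small sketch that applies power iteration to D^-1 A, where D is the diagonal of A. This is only an illustration under assumptions: spectral_radius_dinv_a is a hypothetical name, the start vector is an arbitrary deterministic choice, and amgcl's own estimate may be computed differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical sketch (not amgcl's implementation): estimates the spectral
// radius of D^-1 A for a 0-based CSR matrix A using power iteration.
double spectral_radius_dinv_a(const std::vector<std::ptrdiff_t> &ptr,
                              const std::vector<std::ptrdiff_t> &col,
                              const std::vector<double> &val, int iters)
{
    assert(!ptr.empty() && iters > 0);
    const std::size_t n = ptr.size() - 1;
    std::vector<double> x(n), y(n);
    // Arbitrary deterministic start vector; a random one is more common.
    for (std::size_t i = 0; i < n; ++i) x[i] = 1.0 / (1.0 + i);

    double rho = 0;
    for (int k = 0; k < iters; ++k) {
        // Normalize x so that the norm of y is the current estimate.
        double nx = 0;
        for (double v : x) nx += v * v;
        nx = std::sqrt(nx);
        if (nx == 0) return 0;
        for (double &v : x) v /= nx;

        // y = D^-1 A x
        double ny = 0;
        for (std::size_t i = 0; i < n; ++i) {
            double d = 1, s = 0;
            for (std::ptrdiff_t j = ptr[i]; j < ptr[i + 1]; ++j) {
                if (static_cast<std::size_t>(col[j]) == i) d = val[j];
                s += val[j] * x[col[j]];
            }
            y[i] = s / d;
            ny += y[i] * y[i];
        }
        rho = std::sqrt(ny);
        x.swap(y);
    }
    return rho;
}
```

A few iterations are usually enough for a usable estimate, which is consistent with the small value passed as precond.coarsening.power_iters in the command above.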
Thanks for your reply, sir. Among the sea of options that I can apply, is there any basic rule I can refer to? I mean, other than trying all the combinations of options one by one, is there any rule of thumb we can follow? Any references or links would be a great help.
Using the block backend usually helps if you have a block-structured matrix, though more often with the speed of the setup and the solution than with convergence. In other words, it does not improve convergence by itself. Here it helped because the coarsening became implicitly aware of the block structure of the matrix. We can also tell amgcl that the matrix has a block structure with the precond.coarsening.aggr.block_size=3 parameter:
./solver -A matrix.mtx -f prd_15.mtx \
precond.relax.type=damped_jacobi \
precond.coarsening.estimate_spectral_radius=1 precond.coarsening.power_iters=5 \
precond.coarsening.aggr.block_size=3
Solver
======
Type: BiCGStab
Unknowns: 219852
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.22
Grid complexity: 1.07
Memory footprint: 250.61 M
level unknowns nonzeros memory
---------------------------------------------
0 219852 8989578 214.08 M (81.63%)
1 14784 1939392 34.68 M (17.61%)
2 702 82962 1.84 M ( 0.75%)
Iterations: 82
Error: 7.75492e-09
[Profile: 15.126 s] (100.00%)
[ reading: 6.906 s] ( 45.66%)
[ setup: 0.554 s] ( 3.66%)
[ solve: 7.665 s] ( 50.67%)
Note that the number of iterations is only slightly larger in this case (82 versus 67 with the block backend), but the setup and the solution are about twice as slow.
Beyond that, it is usually just a matter of trying out different options for the smoother and (less importantly) the solver.
By the way, here is your main_2.cpp file updated to use the above options: https://gist.github.com/ddemidov/027a525d30afdab2fb2a19da22516d76
The output is:
$ g++ -o main_2 main_2.cpp -O3 -I /usr/include/eigen3/ -I ~/work/amgcl -fopenmp
$ ./main_2
Solver
======
Type: BiCGStab
Unknowns: 73284
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.23
Grid complexity: 1.07
Memory footprint: 146.29 M
level unknowns nonzeros memory
---------------------------------------------
0 73284 998886 124.28 M (81.42%)
1 5063 217965 19.96 M (17.77%)
2 241 10019 2.04 M ( 0.82%)
67 8.17358e-09
[Profile: 9.623 s] (100.00%)
[ self: 4.742 s] ( 49.28%)
[ setup: 0.256 s] ( 2.66%)
[ solve: 4.625 s] ( 48.07%)
iteration is 3049
error is 9.76464e-16
time in solve is 23.3425 ms.
You are so kind, sir. Thanks a lot~ I have a better understanding now~
Sir, I found that if I use ./solver --matrix /home/xuhui/Temp/matrix.mtx -r, the system can be solved.
./solver --matrix /home/xuhui/Temp/matrix.mtx -r
Solver
======
Type: BiCGStab
Unknowns: 219852
Memory footprint: 11.74 M
Preconditioner
==============
Number of levels: 3
Operator complexity: 1.32
Grid complexity: 1.09
Memory footprint: 222.33 M
level unknowns nonzeros memory
---------------------------------------------
0 219852 8989578 172.15 M (75.66%)
1 18941 2740169 45.48 M (23.06%)
2 1054 151388 4.70 M ( 1.27%)
Iterations: 100
Error: 0.0233907
[Profile: 14.700 s] (100.00%)
[ reading: 8.399 s] ( 57.14%)
[ reorder: 0.039 s] ( 0.27%)
[ setup: 0.366 s] ( 2.49%)
[ solve: 5.892 s] ( 40.08%)
But when we combine -b3 with -r, a segmentation fault occurs.
./solver --matrix /home/xuhui/Temp/matrix.mtx -b3 -r
Segmentation fault (core dumped)
Did I do something wrong? Can the -b option not be combined with -r?
-r tells the solver to reorder the matrix, and it rarely helps. In your case above, the solver did not converge: it just hit the default limit of 100 iterations. The error of 2e-2 is far from the default tolerance of 1e-8. I am not sure why the segfault is happening (I can reproduce it too), but most probably you don't need reordering anyway.
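As a small aid for reading the solver output, the check described above can be written down as a sketch. RunResult, converged, and hit_iteration_limit are hypothetical names used only for illustration; the defaults of 100 iterations and 1e-8 are the ones mentioned in this thread.

```cpp
#include <cassert>

// Hypothetical record of the "Iterations" and "Error" lines of a solver run.
struct RunResult {
    int iters;
    double error;
};

// A run counts as converged only if the reported error reached the tolerance.
bool converged(const RunResult &r, double tol = 1e-8)
{
    assert(tol > 0);
    return r.error <= tol;
}

// True when the solver stopped because it ran out of iterations.
bool hit_iteration_limit(const RunResult &r, int maxiter = 100, double tol = 1e-8)
{
    return r.iters >= maxiter && !converged(r, tol);
}
```

For example, the -r run above (100 iterations, error 0.0233907) hit the iteration limit, while the block-backend run with damped_jacobi (67 iterations, error 8.17358e-09) converged.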
Sir, thanks for your great work on amgcl.
I was using Eigen CG to solve the linear system, but it was not fast enough, so I came to amgcl.
My matrix A and vector f come from a 3D solid finite element method simulation. The matrix A is 219852x219852. It is positive definite. I have code to compare the performance of Eigen CG and amgcl.
For my problem, amgcl takes 70 s and Eigen CG takes 40 s, and Eigen CG even has better accuracy. I believe that amgcl should perform better. I have seen in other discussions that it always takes 8-10 seconds to establish the grid hierarchy and 0-4 seconds to solve the problem. I do not know why my case is so slow, and I have no idea how to tune it.
The matrix data, vector data, code, and scripts are available at the following link:
https://drive.google.com/drive/folders/1vnJ7fdxqG4AdLOm3EwY0Ndj5F3t-JQvC?usp=sharing
I would greatly appreciate any suggestions you may have regarding optimal parameters and combination of options to help solve it faster.
Thanks for your time sir.