ricosjp / monolish

monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Apache License 2.0

Research the effect of level information on the performance of the cuSPARSE ILU preconditioner #90

Closed t-hishinuma closed 2 years ago

t-hishinuma commented 2 years ago

From the cuSPARSE documentation: the level information may not improve performance, yet the analysis that produces it costs extra time. For example, a tridiagonal matrix has no parallelism. In this case, CUSPARSE_SOLVE_POLICY_NO_LEVEL performs better than CUSPARSE_SOLVE_POLICY_USE_LEVEL. If the user has an iterative solver, the best approach is to do csrsv2_analysis() with CUSPARSE_SOLVE_POLICY_USE_LEVEL once. Then do csrsv2_solve() with CUSPARSE_SOLVE_POLICY_NO_LEVEL in the first run and with CUSPARSE_SOLVE_POLICY_USE_LEVEL in the second run, picking the faster one to perform the remaining iterations.

https://docs.nvidia.com/cuda/cusparse/index.html#csric02

t-hishinuma commented 2 years ago

I used a 50x50x50 cube mesh generated by FEM (about 81 nonzeros per row).

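| M | L | U | analysis in create_precon | solve in apply_precon |
|---|---|---|---------------------------|-----------------------|
| 0 | 0 | 0 | 0.082 | 0.934 |
| 0 | 0 | 1 | 0.473 | 0.421 |
| 0 | 1 | 0 | 0.436 | 0.524 |
| 0 | 1 | 1 | 0.803 | 0.018 |
| 1 | 0 | 0 | 0.069 | 1.255 |
| 1 | 0 | 1 | 0.438 | 0.422 |
| 1 | 1 | 0 | 0.421 | 0.531 |
| 1 | 1 | 1 | 0.785 | 0.018 |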
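
One rough way to compare the rows of the table above is to add the two timing columns per configuration. The sketch below does that, assuming both columns are in the same unit and that their sum is a meaningful cost for one run; the issue states neither. `Row`, `kRows`, `total`, `best_row`, and `print_totals` are hypothetical names used only for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <cstdio>

// One row of the table: M/L/U flags plus the two measured times.
struct Row {
  int m, l, u;
  double analysis;  // "analysis in create_precon" column
  double solve;     // "solve in apply_precon" column
};

// Values copied verbatim from the table in this issue.
constexpr std::array<Row, 8> kRows{{
    {0, 0, 0, 0.082, 0.934},
    {0, 0, 1, 0.473, 0.421},
    {0, 1, 0, 0.436, 0.524},
    {0, 1, 1, 0.803, 0.018},
    {1, 0, 0, 0.069, 1.255},
    {1, 0, 1, 0.438, 0.422},
    {1, 1, 0, 0.421, 0.531},
    {1, 1, 1, 0.785, 0.018},
}};

// Assumed cost model: analysis time plus solve time, same unit for both.
inline double total(const Row& r) { return r.analysis + r.solve; }

// Return the configuration with the smallest combined time.
inline Row best_row() {
  return *std::min_element(
      kRows.begin(), kRows.end(),
      [](const Row& a, const Row& b) { return total(a) < total(b); });
}

// Print the combined time for every configuration.
inline void print_totals() {
  for (const Row& r : kRows) {
    std::printf("%d %d %d  total=%.3f\n", r.m, r.l, r.u, total(r));
  }
}
```

Under that assumption the smallest sum is the 1 1 1 row (0.785 + 0.018 = 0.803), with 0 1 1 next (0.821). If the solve column is instead the cost of each preconditioner application, the analysis cost is paid once and the low-solve-time rows gain further as the iteration count grows.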

t-hishinuma commented 2 years ago

I will use USE_LEVEL.