Research the effect of the level information on the performance of the cuSPARSE ILU preconditioner

t-hishinuma commented 2 years ago

The level information may not improve performance, yet it costs extra time for the analysis. For example, a tridiagonal matrix has no parallelism. In this case, CUSPARSE_SOLVE_POLICY_NO_LEVEL performs better than CUSPARSE_SOLVE_POLICY_USE_LEVEL. If the user has an iterative solver, the best approach is to do csrsv2_analysis() with CUSPARSE_SOLVE_POLICY_USE_LEVEL once. Then do csrsv2_solve() with CUSPARSE_SOLVE_POLICY_NO_LEVEL in the first run and with CUSPARSE_SOLVE_POLICY_USE_LEVEL in the second run, and pick the faster one for the remaining iterations.
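
To see why a tridiagonal matrix offers no parallelism, here is a minimal host-side sketch of level scheduling for a lower-triangular solve. It is only an illustration of the idea, not cuSPARSE's internal analysis. The names `CsrPattern`, `compute_levels`, `count_levels`, `lower_bidiagonal`, and `diagonal_only` are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Sparsity pattern of a lower-triangular matrix in CSR form.
// Values are not needed to find the levels. (Hypothetical helper type.)
struct CsrPattern {
    int n;
    std::vector<int> row_ptr;  // size n + 1
    std::vector<int> col_idx;  // column indices, row by row
};

// level[i] = 1 + max(level[j]) over the off-diagonal entries j < i of row i.
// Rows that share a level do not depend on each other, so a forward
// substitution can process them in parallel.
std::vector<int> compute_levels(const CsrPattern& a) {
    assert(static_cast<int>(a.row_ptr.size()) == a.n + 1);
    std::vector<int> level(a.n, 0);
    for (int i = 0; i < a.n; ++i) {
        int lv = 0;
        for (int k = a.row_ptr[i]; k < a.row_ptr[i + 1]; ++k) {
            const int j = a.col_idx[k];
            if (j < i) lv = std::max(lv, level[j] + 1);
        }
        level[i] = lv;
    }
    return level;
}

// Number of sequential steps a level-scheduled solve needs.
int count_levels(const CsrPattern& a) {
    if (a.n == 0) return 0;
    const std::vector<int> level = compute_levels(a);
    return *std::max_element(level.begin(), level.end()) + 1;
}

// Lower factor of a tridiagonal matrix: diagonal plus one sub-diagonal.
CsrPattern lower_bidiagonal(int n) {
    CsrPattern a{n, {0}, {}};
    for (int i = 0; i < n; ++i) {
        if (i > 0) a.col_idx.push_back(i - 1);
        a.col_idx.push_back(i);
        a.row_ptr.push_back(static_cast<int>(a.col_idx.size()));
    }
    return a;
}

// Diagonal-only pattern: every row is independent.
CsrPattern diagonal_only(int n) {
    CsrPattern a{n, {0}, {}};
    for (int i = 0; i < n; ++i) {
        a.col_idx.push_back(i);
        a.row_ptr.push_back(static_cast<int>(a.col_idx.size()));
    }
    return a;
}
```

For the bidiagonal factor, `count_levels(lower_bidiagonal(n))` equals `n`: every row waits for the previous one, so the level information cannot expose any parallel work. For the diagonal pattern the count is 1.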

https://docs.nvidia.com/cuda/cusparse/index.html#csric02
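
The selection procedure described above (one solve per policy, then keep the faster one) could be sketched as follows. This is a host-side outline under assumptions: `SolvePolicy`, `time_once`, and `pick_faster_policy` are hypothetical names, and the callback stands in for the actual cuSPARSE solve call. When timing GPU work with a host clock, the callback would also need to wait for the device to finish before returning.

```cpp
#include <chrono>
#include <utility>

// Stand-in for the two cuSPARSE solve policies (hypothetical enum).
enum class SolvePolicy { NoLevel, UseLevel };

// Runs `solve` once and returns the elapsed wall-clock time in seconds.
// `solve` must block until the work is complete, otherwise the
// measurement only covers the launch.
template <class Solve>
double time_once(Solve&& solve) {
    const auto t0 = std::chrono::steady_clock::now();
    std::forward<Solve>(solve)();
    const auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(t1 - t0).count();
}

// `timed_solve(policy)` performs one solve with the given policy and
// returns its duration in seconds. The first call uses NoLevel, the
// second UseLevel, and the faster policy is returned for the remaining
// iterations. Ties go to NoLevel.
template <class TimedSolve>
SolvePolicy pick_faster_policy(TimedSolve&& timed_solve) {
    const double t_no_level = timed_solve(SolvePolicy::NoLevel);
    const double t_use_level = timed_solve(SolvePolicy::UseLevel);
    return t_use_level < t_no_level ? SolvePolicy::UseLevel
                                    : SolvePolicy::NoLevel;
}
```

In a real solver, both timed calls would be ordinary iterations, so the measurement costs no extra solves. The analysis with USE_LEVEL still has to be done once up front, as the quoted text says.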

t-hishinuma commented 2 years ago

I used a 50x50x50 cube mesh generated by FEM (nnz/row is about 81).

NO_LEVEL is 0
USE_LEVEL is 1

M	L	U	analysis in create_precon	solve in apply_precon
0	0	0	0.082	0.934
0	0	1	0.473	0.421
0	1	0	0.436	0.524
0	1	1	0.803	0.018
1	0	0	0.069	1.255
1	0	1	0.438	0.422
1	1	0	0.421	0.531
1	1	1	0.785	0.018
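
As a quick check of the table, the sketch below adds the two timing columns for each policy combination and picks the smallest total. It assumes both columns use the same unit and can simply be summed, which the table itself does not state. `Measurement`, `kMeasurements`, `total_time`, and `fastest_index` are names made up for this example.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// One row of the table above: policies for M, L, U (0 = NO_LEVEL,
// 1 = USE_LEVEL) and the two measured times.
struct Measurement {
    int m, l, u;
    double analysis;  // "analysis in create_precon"
    double solve;     // "solve in apply_precon"
};

// Values copied from the table, in the same row order.
static const std::array<Measurement, 8> kMeasurements = {{
    {0, 0, 0, 0.082, 0.934},
    {0, 0, 1, 0.473, 0.421},
    {0, 1, 0, 0.436, 0.524},
    {0, 1, 1, 0.803, 0.018},
    {1, 0, 0, 0.069, 1.255},
    {1, 0, 1, 0.438, 0.422},
    {1, 1, 0, 0.421, 0.531},
    {1, 1, 1, 0.785, 0.018},
}};

// Assumption: both columns share one unit, so the total is their sum.
double total_time(const Measurement& r) { return r.analysis + r.solve; }

// Index of the row with the smallest analysis + solve time.
std::size_t fastest_index() {
    std::size_t best = 0;
    for (std::size_t i = 1; i < kMeasurements.size(); ++i) {
        if (total_time(kMeasurements[i]) < total_time(kMeasurements[best])) {
            best = i;
        }
    }
    assert(best < kMeasurements.size());
    return best;
}
```

Under that assumption, the smallest total is the last row (M = 1, L = 1, U = 1) at 0.785 + 0.018 = 0.803, followed by (0, 1, 1) at 0.821. The slowest is (1, 0, 0) at 1.324. The analysis is more expensive with USE_LEVEL for L and U, but the solve time drops enough to make up for it.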

t-hishinuma commented 2 years ago

Based on these results, I use USE_LEVEL.

ricosjp / monolish

Research the effect of the level information on the performance of the cuSPARSE ILU preconditioner #90