ricosjp / monolish

monolish: MONOlithic LInear equation Solvers for Highly-parallel architecture
Apache License 2.0

benchmark CUDA11.x generic API SpMM in double precision #76

Closed t-hishinuma closed 3 years ago

t-hishinuma commented 3 years ago

Benchmark of SpMM using benchmark/matrix/matmul_gpu:
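
For readers unfamiliar with the operation being timed: SpMM multiplies a sparse matrix by a dense matrix, and here the sparse operand is stored in CRS format. The following is a minimal pure-Python sketch of that computation. It is not the monolish or cuSPARSE implementation, and the function name `spmm_crs` and its argument layout are hypothetical, chosen only for illustration.

```python
def spmm_crs(row_ptr, col_ind, val, B, n_cols):
    """Return C = A @ B, with A given in CRS form and B as a dense list of rows.

    row_ptr, col_ind, val are the usual CRS arrays (hypothetical argument
    names for this sketch); n_cols is the number of columns of B and C.
    """
    n_rows = len(row_ptr) - 1
    C = [[0.0] * n_cols for _ in range(n_rows)]
    for i in range(n_rows):
        # Non-zeros of row i live in val[row_ptr[i]:row_ptr[i + 1]].
        for k in range(row_ptr[i], row_ptr[i + 1]):
            a = val[k]
            b_row = B[col_ind[k]]
            for j in range(n_cols):
                C[i][j] += a * b_row[j]
    return C


# A = [[1, 0, 2],
#      [0, 3, 0]]
row_ptr = [0, 2, 3]
col_ind = [0, 2, 1]
val = [1.0, 2.0, 3.0]
B = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
C = spmm_crs(row_ptr, col_ind, val, B, n_cols=2)
print(C)
```

Each stored non-zero contributes one multiply and one add per column of B, so the work scales with the number of non-zeros times the width of B rather than with the full dense product.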

cuSPARSE 11.4 implementation (new)

func                    kind    prec    M       N       K       time [s]        perf [GFLOPS]
matmul(Dense,CRS,Dense) CRS     double  1000    1000    1000    0.00112888      143.505
matmul(Dense,CRS,Dense) CRS     double  1500    1500    1500    0.00241296      151.059
matmul(Dense,CRS,Dense) CRS     double  2000    2000    2000    0.00380715      170.206
matmul(Dense,CRS,Dense) CRS     double  2500    2500    2500    0.00531897      190.356
matmul(Dense,CRS,Dense) CRS     double  3000    3000    3000    0.0073412       198.605

monolish implementation (old)

func                    kind    prec    M       N       K       time [s]        perf [GFLOPS]
matmul(Dense,CRS,Dense) CRS     double  1000    1000    1000    0.0274129       5.90962
matmul(Dense,CRS,Dense) CRS     double  1500    1500    1500    0.0417931       8.72153
matmul(Dense,CRS,Dense) CRS     double  2000    2000    2000    0.0557229       11.629
matmul(Dense,CRS,Dense) CRS     double  2500    2500    2500    0.070489        14.3639
matmul(Dense,CRS,Dense) CRS     double  3000    3000    3000    0.0923186       15.7931
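
To put the two sets of numbers side by side, here is a small sketch that computes the speedup of the new path over the old one from the values reported above. It assumes the second-to-last column is elapsed time in seconds and the last column is a throughput figure where higher is better. The variable and function names are hypothetical.

```python
# Values copied from the two result listings above.
sizes = [1000, 1500, 2000, 2500, 3000]
t_new = [0.00112888, 0.00241296, 0.00380715, 0.00531897, 0.0073412]
t_old = [0.0274129, 0.0417931, 0.0557229, 0.070489, 0.0923186]
perf_new = [143.505, 151.059, 170.206, 190.356, 198.605]
perf_old = [5.90962, 8.72153, 11.629, 14.3639, 15.7931]


def ratios(numer, denom):
    """Element-wise ratio numer[i] / denom[i]."""
    return [a / b for a, b in zip(numer, denom)]


# Speedup measured from time (old / new) and from throughput (new / old).
speedup_time = ratios(t_old, t_new)
speedup_perf = ratios(perf_new, perf_old)

for n, st, sp in zip(sizes, speedup_time, speedup_perf):
    print(f"size {n}: {st:.2f}x by time, {sp:.2f}x by throughput")
```

Both ratios agree: the cuSPARSE-based path is roughly 24x faster at size 1000, and the gap narrows to roughly 12.6x at size 3000.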

t-hishinuma commented 3 years ago

72 is done.