cyclops-community / ctf

Cyclops Tensor Framework: parallel arithmetic on multidimensional arrays

unexpected performance for SpMV operation #140

Open rohany opened 2 years ago

rohany commented 2 years ago

I'm running some CTF SpMV kernels using the contraction interface (a["i"] = B["ij"] * c["j"]), and I'm seeing performance that is nearly two orders of magnitude slower than systems like PETSc and Trilinos on large matrices from the SuiteSparse collection (such as the arabic-2005 graph). I know I've asked similar questions before, but I was wondering whether such a discrepancy is expected, or whether CTF has a specialized kernel for the SpMV operation. I believe I've configured CTF correctly for my system, but I'm happy to share the configuration logs to double-check.

cc @solomonik

raghavendrak commented 2 years ago

CTF treats vectors no differently from general tensors, so the code has to account for generalized redistribution and storage formats. We used specialized code to handle vectors without these overheads in a different project and found the performance gains to be quite significant. We will integrate this into CTF soon and tag this issue when it is done.
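
For reference, the contraction a["i"] = B["ij"] * c["j"] discussed in this thread corresponds to the row-wise kernel sketched below for a matrix stored in compressed sparse row (CSR) form. This is a minimal single-process illustration of what a specialized SpMV kernel computes. The names `CsrMatrix` and `spmv` are hypothetical, and the code is not taken from CTF, PETSc, or Trilinos, nor is it the specialized code mentioned above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CSR container for illustration only;
// not a CTF, PETSc, or Trilinos type.
struct CsrMatrix {
    std::size_t nrows;
    std::size_t ncols;
    std::vector<std::size_t> row_ptr;  // size nrows + 1; row i spans [row_ptr[i], row_ptr[i+1])
    std::vector<std::size_t> col_idx;  // column index of each stored nonzero
    std::vector<double> values;        // value of each stored nonzero
};

// Computes a[i] = sum over j of B[i][j] * c[j], visiting only stored nonzeros.
std::vector<double> spmv(const CsrMatrix& B, const std::vector<double>& c) {
    assert(B.row_ptr.size() == B.nrows + 1);
    assert(c.size() == B.ncols);

    std::vector<double> a(B.nrows, 0.0);
    for (std::size_t i = 0; i < B.nrows; ++i) {
        double sum = 0.0;
        for (std::size_t k = B.row_ptr[i]; k < B.row_ptr[i + 1]; ++k) {
            sum += B.values[k] * c[B.col_idx[k]];
        }
        a[i] = sum;
    }
    return a;
}
```

Because the vector is indexed directly and the matrix is read in its stored order, a kernel of this shape needs no general redistribution or format conversion step, which is the kind of overhead the reply above attributes to the generic tensor path.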