cp2k/dbcsr
DBCSR: Distributed Block Compressed Sparse Row matrix library
https://cp2k.github.io/dbcsr/
GNU General Public License v2.0
ocl: fixes and improvements #728
Closed (hfp closed 8 months ago)
hfp commented 8 months ago
tune_multiply.py
Save good intermediate results after an earlier kernel failed.
Fixed a potential type-conversion error.
Improved error message.
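
The first item above can be illustrated with a minimal sketch. This is not the code from tune_multiply.py: the names `tune`, `run_kernel`, `fake_kernel`, the `bs` and `gflops` keys, and the `best.json` file are all hypothetical. It only shows the general idea that a failing kernel is recorded and skipped, while the best result seen so far is still written to disk.

```python
import json
import os
import tempfile


def tune(configs, run_kernel, out_path):
    """Try each config and persist the best result seen so far (hypothetical helper)."""
    best = None
    failures = []
    for config in configs:
        try:
            gflops = run_kernel(config)
        except RuntimeError as error:
            # Record the failure and move on; later good results are still saved.
            failures.append((config, str(error)))
            continue
        if best is None or gflops > best["gflops"]:
            best = {"config": config, "gflops": gflops}
            # Write the intermediate best immediately so it survives later failures.
            with open(out_path, "w") as file:
                json.dump(best, file)
    return best, failures


def fake_kernel(config):
    """Stand-in for a real kernel run; fails for one configuration (assumption)."""
    if config["bs"] == 2:
        raise RuntimeError("kernel failed to build")
    return 10.0 * config["bs"]


out_path = os.path.join(tempfile.mkdtemp(), "best.json")
best, failures = tune([{"bs": 1}, {"bs": 2}, {"bs": 3}], fake_kernel, out_path)
with open(out_path) as file:
    saved = json.load(file)
```

In this sketch the second configuration fails, yet the result of the third is still saved, because the failure is handled per configuration rather than aborting the whole run.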
SMM-kernel
Repurposed 1<LU and improved the code path using general blocks.
Folded some explicit control flow into the loop condition.
Use 32-bit integer variables consistently.
Removed superfluous casts.
Improved comments.