ROCm / Tensile

Stretching GPU performance for GEMMs and tensor contractions.
MIT License