ROCm/Tensile
Stretching GPU performance for GEMMs and tensor contractions.
MIT License
Enable VgprForLocalReadPacking + PrefetchLocalRead=1 #1864
Closed: nakajee closed this 9 months ago
nakajee commented 9 months ago
Removed the reject condition for VgprForLocalReadPacking (VFLRP) combined with PrefetchLocalRead (PLR) = 1.
Added test cases for VFLRP + PLR=1.
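
To illustrate what the two changes amount to, here is a minimal Python sketch. It is not Tensile's actual validation code: the function `is_rejected`, its `reject_vflrp_with_plr1` flag, and the dictionary layout are hypothetical names introduced for illustration. Only the parameter names `VgprForLocalReadPacking` and `PrefetchLocalRead` come from this PR. The sketch assumes the old check rejected the combination outright.

```python
# Hypothetical sketch; not Tensile's actual solution-validation code.
def is_rejected(params, reject_vflrp_with_plr1):
    """Return True if the parameter combination would be rejected.

    reject_vflrp_with_plr1=True models the assumed behavior before this PR;
    False models the behavior after the reject condition is removed.
    """
    vflrp = params.get("VgprForLocalReadPacking", False)
    plr = params.get("PrefetchLocalRead", 0)
    # Assumed old rule: VFLRP together with PLR=1 was not allowed.
    if reject_vflrp_with_plr1 and vflrp and plr == 1:
        return True
    return False


# The combination this PR enables.
combo = {"VgprForLocalReadPacking": True, "PrefetchLocalRead": 1}

print(is_rejected(combo, reject_vflrp_with_plr1=True))   # True
print(is_rejected(combo, reject_vflrp_with_plr1=False))  # False
```

A test case in the spirit of the second change would assert that the combination is accepted once the condition is gone, for example `assert not is_rejected(combo, reject_vflrp_with_plr1=False)`.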