ROCm / AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0

Add support for gfx1101/1102 #79

Closed by aska-0096 10 months ago

aska-0096 commented 10 months ago

Tests passed on 7700XT (Navi32) at 8~9 it/s (PwrCap=130W).