ROCm / AITemplate

AITemplate is a Python framework that renders neural networks into high-performance CUDA/HIP C++ code. It is specialized for FP16 TensorCore (NVIDIA GPU) and MatrixCore (AMD GPU) inference.
Apache License 2.0

ETA on upstreaming gfx11 support? #68

Open wsippel opened 1 year ago

wsippel commented 1 year ago

I've noticed the navi3_rel_ver_1.0 branch sitting in the repo for a while now. Is there an ETA on when this might end up upstream? I've played around with it for a bit on my 7900XTX, and it appears to be the fastest way to run Stable Diffusion on AMD hardware right now. I'd be interested in compiling and providing AIT modules for StabilityAI's ComfyUI (AIT on Nvidia is already supported), but the navi3 branch doesn't currently include SDXL support.

Boom-Hacker commented 1 year ago

Did you get it running successfully on the 7900XTX?

Boom-Hacker commented 10 months ago

> I've noticed the navi3_rel_ver_1.0 branch sitting in the repo for a while now. Is there an ETA on when this might end up upstream? I've played around with it for a bit on my 7900XTX, and it appears to be the fastest way to run Stable Diffusion on AMD hardware right now. I'd be interested in compiling and providing AIT modules for StabilityAI's ComfyUI (AIT on Nvidia is already supported), but the navi3 branch doesn't currently include SDXL support.

https://github.com/rocmunofficial/AITemplate-navi3_new/tree/navi3_merge2