pytorch-labs / gpt-fast

Simple and efficient PyTorch-native transformer text generation in <1000 LOC of Python.
BSD 3-Clause "New" or "Revised" License

INT4 quantization not working on MI210 #154

Status: Open — opened by yafehlis 2 months ago

yafehlis commented 2 months ago

INT8 quantization works fine on the MI210, but INT4 quantization does not. (Screenshot attached.)

Chillee commented 2 months ago

Yeah, int4 quantization doesn't work on AMD GPUs right now.
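For context on why INT4 needs dedicated backend support: int4 quantization packs two 4-bit weights into each byte, so the matmul kernel must unpack nibbles on the fly. In gpt-fast this path relies on specialized int4 weight-only kernels (e.g. `torch.ops.aten._weight_int4pack_mm`), and the likely explanation for this issue, as an assumption based on the thread, is that those kernels have no ROCm implementation yet. A minimal pure-Python sketch of the nibble packing itself (illustrative only, not gpt-fast's actual code):

```python
def pack_int4(values):
    # Pack pairs of 4-bit values (each in 0..15) into single bytes:
    # the even-indexed value goes in the low nibble, the odd-indexed
    # value in the high nibble.
    assert len(values) % 2 == 0, "need an even number of 4-bit values"
    assert all(0 <= v <= 15 for v in values), "values must fit in 4 bits"
    out = bytearray()
    for lo, hi in zip(values[0::2], values[1::2]):
        out.append((hi << 4) | lo)
    return bytes(out)


def unpack_int4(packed):
    # Reverse the packing: split each byte back into its two nibbles.
    vals = []
    for b in packed:
        vals.append(b & 0x0F)  # low nibble
        vals.append(b >> 4)    # high nibble
    return vals
```

Because the weights live in this packed layout, an int4 matmul cannot reuse the int8 kernels directly; until equivalent kernels land for ROCm, `--mode int8` remains the working quantization path on AMD GPUs.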