pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License
5.34k stars, 484 forks

Enable TinyLLAMAs quantization #151

Closed by malfet 2 months ago

malfet commented 2 months ago

Copy and paste the code from https://github.com/pytorch-labs/gpt-fast/commit/11ce176d48a60e0682c817114caab37070c6a7ba into quantize.py.