pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License
5.35k stars 484 forks

Fixes for eval and GPTQ after move to gpt-fast #93

Closed: HDCharles closed this 5 months ago

HDCharles commented 5 months ago

Stack from ghstack (oldest at bottom):

Summary: The move from simple_gpt to gpt-fast broke eval and GPTQ. This PR fixes the breakage introduced by the move.

Note: GPTQ is still broken because of a KV cache issue in the model. Fixing it requires either non-public PyTorch functionality or a change to the GPTQ implementation. See the next PR in the stack for a fix.

Test Plan:

python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5

Reviewers:

Subscribers:

Tasks:

Tags: