Summary: the move from simple_gpt to gpt-fast changed some things that
eval and GPTQ relied on. This PR unbreaks eval and fixes the GPTQ
breakage caused by that move.
Note: GPTQ still does not run, due to a kv cache issue in the model.
Fixing that needs either non-public PyTorch functionality or a change to
the GPTQ implementation. See the next PR in the stack for a fix.
Stack from ghstack (oldest at bottom):
83
91
Test Plan:
python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5