Fixes for eval and GPTQ after move to gpt-fast

Stack from ghstack (oldest at bottom):

#83
#91
-> #82

Summary: The move from simple_gpt to gpt-fast changed a few things that eval and GPTQ depended on. This PR fixes the resulting breakage in eval and GPTQ.

Note: GPTQ is still broken because of a KV cache issue in the model. Fixing it requires either non-public PyTorch functionality or a change to the GPTQ implementation. See the next PR in the stack for a fix.

Test Plan:

python eval.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth
python quantize.py --checkpoint_path checkpoints/$MODEL_REPO/model.pth --mode int4-gptq --calibration_tasks wikitext --calibration_limit 5

