pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Allow small models to work with convert_hf_checkpoint. Added TinyLlama to the model list #52

Open briandw opened 9 months ago

briandw commented 9 months ago

Small models on HF don't have pytorch_model.bin.index.json files, since an index is unnecessary when the checkpoint isn't sharded. I changed convert_hf_checkpoint.py to accept a single pytorch_model.bin file when no index file is present. I added PY007/TinyLlama-1.1B-intermediate-step-480k-1T to the model list since it's used in the speculate_7B_int4.sh script.
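A minimal sketch of the fallback described above, not the actual diff in this PR. The helper name find_checkpoint_files is hypothetical; it assumes the usual HF layout, where a sharded checkpoint ships a pytorch_model.bin.index.json whose "weight_map" maps tensor names to shard files, and an unsharded one ships only pytorch_model.bin.

```python
import json
from pathlib import Path


def find_checkpoint_files(checkpoint_dir: Path) -> list[Path]:
    """Return the weight files to load (hypothetical helper, not from the PR)."""
    index_path = checkpoint_dir / "pytorch_model.bin.index.json"
    if index_path.is_file():
        # Sharded checkpoint: the index maps each tensor name to its shard file.
        with open(index_path) as f:
            index = json.load(f)
        shard_names = sorted(set(index["weight_map"].values()))
        return [checkpoint_dir / name for name in shard_names]

    # Small models: no index file, just a single weights file.
    single_file = checkpoint_dir / "pytorch_model.bin"
    if single_file.is_file():
        return [single_file]

    raise FileNotFoundError(
        f"No pytorch_model.bin.index.json or pytorch_model.bin in {checkpoint_dir}"
    )
```

The returned list can then be iterated and merged into one state dict the same way in both cases, so the rest of the conversion doesn't need to know whether the checkpoint was sharded.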

TinyLlama now works, with the exception that weights_only would have to be changed to False on line 74 of convert_hf_checkpoint.py. I'll leave that up for discussion, since loading with weights_only disabled is less secure.