pytorch-labs / gpt-fast

Simple and efficient pytorch-native transformer text generation in <1000 LOC of python.
BSD 3-Clause "New" or "Revised" License

Hard-coded Llama-3 model name pattern matching breaks scripts/convert_hf_checkpoint.py #177

Open ephremw opened 1 month ago

ephremw commented 1 month ago

The line below (https://github.com/pytorch-labs/gpt-fast/blob/main/scripts/convert_hf_checkpoint.py#L37) makes the HF-to-gpt-fast converter treat CodeLlama-34b as a Llama-3 model, because the string "Llama-3" is a substring of "CodeLlama-34b". The conversion then fails with an error.

is_llama3 = "Llama-3" in model_name
File "gpt-fast/scripts/convert_hf_checkpoint.py", line 43, in convert_hf_checkpoint
    bin_files = [bin for bin in original_dir.iterdir() if pattern.match(bin.name)]
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
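
The false positive can be reproduced without the converter. The sketch below contrasts the substring test quoted above with a stricter check that rejects a digit immediately after "Llama-3". The function names, the regular expression, and the example model names are assumptions chosen for illustration. They are not taken from the gpt-fast repository, and this is one possible approach rather than the project's fix.

```python
import re

# Hypothetical stricter pattern: "Llama-3" must NOT be followed by another digit,
# so "Llama-34b" is rejected while "Llama-3-8B" and "Llama-3.1-8B" still match.
_LLAMA3_PATTERN = re.compile(r"Llama-3(?!\d)")


def is_llama3_substring(model_name: str) -> bool:
    """Substring test, same logic as the line quoted in this issue."""
    return "Llama-3" in model_name


def is_llama3_strict(model_name: str) -> bool:
    """Illustrative alternative (an assumption, not the repository's code)."""
    return _LLAMA3_PATTERN.search(model_name) is not None


if __name__ == "__main__":
    # Example model names, used here only to exercise both checks.
    for name in ("CodeLlama-34b-Python-hf", "Meta-Llama-3-8B", "Meta-Llama-3.1-8B"):
        print(name, is_llama3_substring(name), is_llama3_strict(name))
```

A negative lookahead keeps the check close to the original one-liner. A more robust option might be to read the model family from the checkpoint's own configuration instead of from its directory name, but that depends on details of the converter not shown here.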