turboderp / exllamav2

A fast inference library for running LLMs locally on modern consumer-class GPUs
MIT License

Cannot load models saved with HF transformers due to shared tensors in safetensors #408

Closed AndrewRyanChama closed 2 weeks ago

AndrewRyanChama commented 2 months ago

I'm trying to open a checkpoint that was saved with Hugging Face transformers, but it fails:

from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained('Qwen/Qwen1.5-0.5B')
model.save_pretrained("testmodel")

Then, when opening it with exllamav2, I get this error:

    raise ValueError(f" ## Could not find {prefix}.* in model")
ValueError:  ## Could not find lm_head.* in model

The saved safetensors file no longer contains an lm_head tensor. I believe this is due to the torch shared-tensors handling in safetensors: https://huggingface.co/docs/safetensors/torch_shared_tensors
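This can be confirmed by listing the tensor names in the saved file. A minimal sketch; the single-file name model.safetensors and the Qwen-style key names are assumptions:

```python
from safetensors import safe_open

# Path assumes save_pretrained() wrote a single, unsharded safetensors file
path = "testmodel/model.safetensors"

with safe_open(path, framework="pt") as f:
    keys = list(f.keys())

print("lm_head.weight" in keys)             # False: the tied head was deduplicated on save
print("model.embed_tokens.weight" in keys)  # True: only the embedding copy is stored
```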

The expected behavior is that exllamav2 should be able to load checkpoints saved by Hugging Face transformers.

turboderp commented 2 months ago

You can try converting the model with the -unshare flag using the util/convert_safetensors.py script. ExLlama does support tied embeddings, but I didn't enable it for Qwen because none of the official Qwen releases actually seem to use shared tensors. Even though tied embeddings are set to True for the release model, the .safetensors file that ships with it actually has separate embedding and head tensors.
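As an alternative to the conversion script, the shared tensor can be duplicated manually before loading. A rough sketch, assuming a single model.safetensors file and the standard Qwen2 key names:

```python
from safetensors.torch import load_file, save_file

path = "testmodel/model.safetensors"
tensors = load_file(path)

# With tied embeddings, safetensors keeps only the embedding copy,
# so recreate lm_head.weight as an explicit, separate duplicate.
if "lm_head.weight" not in tensors:
    tensors["lm_head.weight"] = tensors["model.embed_tokens.weight"].clone()

save_file(tensors, path, metadata={"format": "pt"})
```

Rewriting the file this way should satisfy the lm_head.* lookup that produced the error above.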