Closed: AndrewRyanChama closed this issue 2 weeks ago
You can try converting the model with the -unshare flag using the util/convert_safetensors.py script. ExLlama does support tied embeddings, but I didn't enable it for Qwen because none of the official Qwen releases seem to actually use shared tensors. Even though tied embeddings are enabled in the release model's config, the .safetensors file that ships with that model actually has separate embedding and head tensors.
I'm trying to open a checkpoint that was saved from Hugging Face Transformers, but it fails.
When I open it with exllamav2 I get the error:
The safetensors file no longer contains an lm_head tensor. I believe this is due to PyTorch's shared-tensor handling in safetensors: https://huggingface.co/docs/safetensors/torch_shared_tensors
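You can confirm which tensors the file actually contains by reading its header. A safetensors file starts with an 8-byte little-endian header length followed by a JSON header that maps tensor names to their metadata, so no ML libraries are needed to list the names. The sketch below builds a minimal in-memory blob with only a tied embedding tensor (the tensor names are illustrative, not taken from any real checkpoint) and shows that no lm_head entry is present:

```python
import json
import struct

def safetensors_tensor_names(raw: bytes) -> list:
    """List tensor names stored in a safetensors blob.

    Layout: 8-byte little-endian header length, then a JSON header
    mapping tensor names to dtype/shape/offset metadata.
    """
    (header_len,) = struct.unpack("<Q", raw[:8])
    header = json.loads(raw[8 : 8 + header_len])
    return [k for k in header if k != "__metadata__"]

# Build a minimal blob the way a tied-weights checkpoint looks:
# only the embedding tensor is serialized, the head is deduplicated.
data = struct.pack("<2f", 0.0, 1.0)  # two float32 values
header = json.dumps({
    "model.embed_tokens.weight": {
        "dtype": "F32", "shape": [1, 2], "data_offsets": [0, 8],
    }
}).encode()
blob = struct.pack("<Q", len(header)) + header + data

names = safetensors_tensor_names(blob)
print(names)                      # ['model.embed_tokens.weight']
print("lm_head.weight" in names)  # False: the head tensor was dropped
```

On a real checkpoint you would read the first bytes of the .safetensors file instead of building a blob; if lm_head.weight is missing from the names while the embedding tensor is present, the weights were saved tied.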
The expected behavior is that exllamav2 should be able to load checkpoints saved by Hugging Face Transformers.