Open Lyutoon opened 2 days ago
You need to set the expert metadata for llama.cpp to use the expert tensors; otherwise it will ignore these tensors, which leads to this error. attn_rot_embd should also be removed.
Oh! Thanks for the reply. How can I set this expert metadata in gguf-py?
Look into the way convert_hf_to_gguf.py does it.
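As a rough sketch of what setting the expert metadata can look like with gguf-py: the snippet below assumes the `GGUFWriter` methods `add_expert_count` and `add_expert_used_count`, which write the `<arch>.expert_count` and `<arch>.expert_used_count` keys. The helper names `expert_kv_keys` and `write_moe_gguf` and their parameters are hypothetical, and a real model needs much more metadata (block count, embedding length, tokenizer data, and so on) before llama-cli will load it.

```python
def expert_kv_keys(arch: str) -> dict:
    """Metadata key names that llama.cpp reads for mixture-of-experts models."""
    return {
        "expert_count": f"{arch}.expert_count",
        "expert_used_count": f"{arch}.expert_used_count",
    }


def write_moe_gguf(path, arch, n_expert, n_expert_used, tensors):
    """Hypothetical helper: write tensors plus the expert metadata.

    `tensors` is assumed to be a dict mapping tensor names to numpy arrays.
    """
    # Imported lazily so expert_kv_keys() works without gguf-py installed.
    import gguf

    writer = gguf.GGUFWriter(path, arch)

    # Without these two values the loader has no expert count to work with.
    writer.add_expert_count(n_expert)
    writer.add_expert_used_count(n_expert_used)

    for name, array in tensors.items():
        writer.add_tensor(name, array)

    # Header, key/value metadata, and tensor data are written in this order.
    writer.write_header_to_file()
    writer.write_kv_data_to_file()
    writer.write_tensors_to_file()
    writer.close()
```

For an architecture string of "llama", `expert_kv_keys("llama")` gives the two key names to look for when inspecting the resulting file.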
Thanks! I’ll have a look and try!
What happened?
Hi, I have recently been learning the gguf-py library and wrote a script that uses it to create a GGUF file. After creating the file, I tried to load it with llama-cli, but it reported a wrong number of tensors. So I'm wondering whether there are inconsistencies between the C++ loader and the Python loader.
Here is my script:
Then you can see the result:
As we can see, the Python reader identified 18 tensors, while llama-cli knows that there are 18 tensors but identified only 13 of them.
I'm wondering what is wrong with my script. Could you please help me figure it out? Thanks a lot!
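One way to check the declared tensor count independently of both loaders is to read the GGUF header directly. The sketch below assumes a little-endian GGUF file of version 2 or 3, where the header is the 4-byte magic `GGUF`, a uint32 version, a uint64 tensor count, and a uint64 metadata key/value count. The function name `read_gguf_counts` is made up for illustration.

```python
import struct

GGUF_MAGIC = b"GGUF"
HEADER_SIZE = 24  # 4 (magic) + 4 (version) + 8 (tensor count) + 8 (kv count)


def read_gguf_counts(path):
    """Return (version, tensor_count, metadata_kv_count) from a GGUF header.

    Assumes a little-endian file with 64-bit counts (GGUF v2 or v3).
    """
    with open(path, "rb") as f:
        header = f.read(HEADER_SIZE)

    if len(header) < HEADER_SIZE or header[:4] != GGUF_MAGIC:
        raise ValueError("not a GGUF file (bad magic or truncated header)")

    # "<IQQ": little-endian uint32 followed by two uint64 values.
    version, tensor_count, kv_count = struct.unpack("<IQQ", header[4:HEADER_SIZE])
    return version, tensor_count, kv_count
```

If the count in the header matches what the Python reader reports, the file itself declares all of its tensors, and the difference is more likely in which tensors the loader expects for the given architecture and metadata.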
Name and Version
build: 3909 (11ac9800) with cc (Ubuntu 9.4.0-1ubuntu1~20.04.2) 9.4.0 for x86_64-linux-gnu
What operating system are you seeing the problem on?
Linux
Relevant log output
No response