marella / ctransformers

Python bindings for Transformer models implemented in C/C++ using the GGML library.

falcon.cpp: tensor 'lm_head.weight' is missing from model #81

Open lppllppl920 opened 1 year ago

lppllppl920 commented 1 year ago

Whenever I try to load a QLoRA-merged Falcon 40B model, loading fails with the error below: `error loading model: falcon.cpp: tensor 'lm_head.weight' is missing from model`

As a hacky workaround, I replaced the line at https://github.com/marella/ctransformers/blob/main/models/ggml/libfalcon.cpp#L1665 with `model.lm_head = model.tok_embeddings;`

Is it possible to fix this so that loading does not always require `lm_head.weight` to be present in the Falcon 40B model?

marella commented 1 year ago

I don't think it is correct to change it. Which scripts are you using for conversion? Can you please try running the model with https://github.com/cmp-nct/ggllm.cpp and see if it throws the same error?

lppllppl920 commented 1 year ago

I think this is actually a known issue. It was fixed in text-generation-inference at https://github.com/huggingface/text-generation-inference/pull/501#issuecomment-1663841922 with PR https://github.com/huggingface/text-generation-inference/pull/762/files#diff-2111bae5f77d998a3fe39888906b3c7be122313241ed6b69b0b0baf5abb735bbL57.