huggingface / candle

Minimalist ML framework for Rust

Quantized Phi-3 example fails with "cannot find llama.attention.head_count in metadata" #2154

Open MoonKraken opened 6 months ago

MoonKraken commented 6 months ago
```
cargo run --example quantized-phi --release -- --prompt "what is the best thing about rust?" --which phi-3
   Compiling candle-examples v0.5.0 (/Users/kenk/Documents/Code/OpenSource/candle/candle-examples)
    Finished release [optimized] target(s) in 3.21s
     Running `target/release/examples/quantized-phi --prompt 'what is the best thing about rust?' --which phi-3`
avx: false, neon: true, simd128: false, f16c: false
temp: 0.80 repeat-penalty: 1.10 repeat-last-n: 64
Running on CPU, to run on GPU(metal), build this example with `--features metal`
loaded 195 tensors (2.39GB) in 0.08s
Error: cannot find llama.attention.head_count in metadata
```

Hardware: M1 MacBook

This issue does not occur with phi-2 or with any other example I've tried, and it still occurs even with the metal feature enabled.
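
A quick way to check what the file actually contains is to dump its GGUF metadata keys and see whether they use the `llama.` prefix the example expects. Below is a minimal sketch, assuming candle's `gguf_file` API and an `anyhow` dependency; the file path is a placeholder for wherever the GGUF file was cached:

```rust
use candle_core::quantized::gguf_file;

fn main() -> anyhow::Result<()> {
    // Placeholder path: point this at the downloaded GGUF file.
    let path = "Phi-3-mini-4k-instruct-q4.gguf";
    let mut file = std::fs::File::open(path)?;
    let content = gguf_file::Content::read(&mut file)?;

    // Print every metadata key so the naming prefix (e.g. `llama.*`)
    // is visible at a glance.
    let mut keys: Vec<_> = content.metadata.keys().collect();
    keys.sort();
    for key in keys {
        println!("{key}");
    }
    Ok(())
}
```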

socathie commented 6 months ago

I believe this is not a candle issue. I downloaded the model a few days ago and it runs without error, while my colleague gets the same error @MoonKraken mentioned.

Upon investigation, I found that my model file has a SHA256 hash of 1cd9a9df07350196623f93bf4829cf228959e07ad32f787b8fdd7f5956f5b9de but his is 8a83c7fb9049a9b2e92266fa7ad04933bb53aa1e85136b7b30f1b8000ff2edef - his hash matches the one currently on the model page.

Googling my hash, I found my file on a branch that isn't listed on the model page here, and it seems the author has continued pushing to that branch, suggesting it is the correct one.
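
For anyone who wants to compare their local file against the hashes above, here is a minimal sketch using the `sha2` crate (an added dependency; the file name is again a placeholder):

```rust
use sha2::{Digest, Sha256};
use std::io::Read;

fn main() -> std::io::Result<()> {
    // Placeholder path: point this at the downloaded GGUF file.
    let mut file = std::fs::File::open("Phi-3-mini-4k-instruct-q4.gguf")?;

    // Hash the file in chunks so a multi-GB model never has to fit in memory.
    let mut hasher = Sha256::new();
    let mut buf = [0u8; 8192];
    loop {
        let n = file.read(&mut buf)?;
        if n == 0 {
            break;
        }
        hasher.update(&buf[..n]);
    }
    let hex: String = hasher.finalize().iter().map(|b| format!("{b:02x}")).collect();
    println!("{hex}");
    Ok(())
}
```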

This might be a serious issue: it suggests that the commit has somehow "switched" to a different (wrong) one without the author knowing.

LaurentMazare commented 6 months ago

Thanks for looking into this, it seems pretty odd for the hash to have changed like that. I've just modified the example code in #2156 so that it forces the use of the separate branch you mentioned, so hopefully that will fix it for your colleague and others.
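
For illustration, pinning a download to a specific revision with the `hf-hub` crate looks roughly like the sketch below; the repo id, file name, and branch name here are placeholders, and #2156 has the actual values used by the example:

```rust
use hf_hub::api::sync::Api;
use hf_hub::{Repo, RepoType};

fn main() -> anyhow::Result<()> {
    let api = Api::new()?;
    // `with_revision` pins the download to a named branch (or commit)
    // instead of whatever "main" currently points at.
    let repo = api.repo(Repo::with_revision(
        "microsoft/Phi-3-mini-4k-instruct-gguf".to_string(), // placeholder repo id
        RepoType::Model,
        "some-branch".to_string(), // placeholder revision
    ));
    let model_path = repo.get("Phi-3-mini-4k-instruct-q4.gguf")?; // placeholder file
    println!("cached at {}", model_path.display());
    Ok(())
}
```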

socathie commented 6 months ago

Thanks - this is a great temporary workaround. Is there any way we can also flag this to the Hugging Face team?

LaurentMazare commented 6 months ago

Yeah, not sure what they would think of this. Anyway, I've also put together #2157, which uses the new naming convention and an implementation closer to the phi-3 codebase rather than re-using the llama one, so hopefully we'll also be covered for upcoming phi-3 models that use the current "main" branch.
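
As a rough sketch of what reading metadata under a newer naming convention could look like (this is not the actual #2157 implementation, and the `phi3.` key name is an assumption), one could try the phi3-prefixed key first and fall back to the llama-prefixed one:

```rust
use std::collections::HashMap;

use candle_core::quantized::gguf_file;

// Hypothetical helper: prefer the phi3-prefixed key, fall back to the
// llama-prefixed key used by older GGUF exports of the model.
fn head_count(metadata: &HashMap<String, gguf_file::Value>) -> anyhow::Result<u32> {
    let value = metadata
        .get("phi3.attention.head_count")
        .or_else(|| metadata.get("llama.attention.head_count"))
        .ok_or_else(|| anyhow::anyhow!("cannot find attention.head_count in metadata"))?;
    Ok(value.to_u32()?)
}
```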