huggingface / candle

Minimalist ML framework for Rust
Apache License 2.0

quantized_llama ModelWeights path used in quantized Phi3 #2125

Closed: AntBlo closed this 2 weeks ago

AntBlo commented 2 weeks ago

quantized_llama seems wrong here in the quantized-phi example: https://github.com/huggingface/candle/blob/cfab6e761696c18b1ce5d3a339ab57ef191ca749/candle-examples/examples/quantized-phi/main.rs#L16 Shouldn't the import be from candle_transformers::models::quantized_phi? I haven't tested it, but I noticed it while reading the code.

LaurentMazare commented 2 weeks ago

That's actually expected: phi v3 is a llama architecture and not at all the same as phi 1 and 2. They removed all the specificities of the model (biases, parallel mlp, etc.), so it was far easier to plug in the quantized-llama example. The same is done in llama.cpp, where the model is reported as a llama model rather than a phi one.
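For readers skimming the thread, here is a rough sketch of the import pattern being discussed: one llama-style implementation reused under a Phi3 alias, next to a separate implementation for the older phi models. The two modules below are hypothetical stand-ins written only for this illustration. They are not candle's real types, and `architecture()` is an invented method, not part of any candle API.

```rust
// Stand-in modules: they mimic only the *shape* of the pattern and are
// NOT candle's real types or signatures (assumption for illustration).
mod quantized_llama {
    /// Hypothetical stand-in for a llama-style quantized model.
    pub struct ModelWeights;

    impl ModelWeights {
        /// Invented helper reporting which architecture this type implements.
        pub fn architecture(&self) -> &'static str {
            "llama"
        }
    }
}

mod quantized_phi {
    /// Hypothetical stand-in for the older phi-style quantized model.
    pub struct ModelWeights;

    impl ModelWeights {
        /// Invented helper reporting which architecture this type implements.
        pub fn architecture(&self) -> &'static str {
            "phi"
        }
    }
}

// Aliasing at the import site: the Phi3 name points at the llama
// implementation, while Phi2 keeps its own dedicated implementation.
use quantized_llama::ModelWeights as Phi3;
use quantized_phi::ModelWeights as Phi2;

/// Wrapper so calling code can handle either model through one type.
enum Model {
    Phi2(Phi2),
    Phi3(Phi3),
}

impl Model {
    /// Forwards to whichever implementation backs this variant.
    fn architecture(&self) -> &'static str {
        match self {
            Model::Phi2(m) => m.architecture(),
            Model::Phi3(m) => m.architecture(),
        }
    }
}

fn main() {
    let models = [Model::Phi2(Phi2), Model::Phi3(Phi3)];
    for m in &models {
        println!("{}", m.architecture());
    }
}
```

The alias keeps the calling code readable (it still says Phi3) while making it explicit at the import line that the underlying implementation is the llama one.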

AntBlo commented 2 weeks ago

Aah, okay. Sorry for the confusion ^^; And thanks for the explanation! Closing this