`quantized_llama` seems wrong here in the `quantized-phi` model: https://github.com/huggingface/candle/blob/cfab6e761696c18b1ce5d3a339ab57ef191ca749/candle-examples/examples/quantized-phi/main.rs#L16 Shouldn't it be from `candle_transformers::models::quantized_phi`? Haven't tested it, but saw it while reading.

That's actually expected: Phi-3 is a llama architecture and not at all the same as Phi-1 and Phi-2. They removed all the specifics of those models (biases, parallel MLP, etc.), so it was far easier to plug it into the quantized-llama example. The same is done in llama.cpp, where the model is reported as a llama model rather than a phi one.

Aah, okay. Sorry for the confusion ^^; And thanks for the explanation! Closing this.

Closed: AntBlo closed 2 weeks ago