@CaptainOfHacks thank you, I'll take a look. Can you link to a Hugging Face model I can use for testing this?
Forgive me if I'm intruding, but this one looks like it'll work: https://huggingface.co/xtuner/llava-phi-3-mini-gguf
Yes, this is the model we want support for. Thank you @xBelladonna for the Hugging Face reference.
Any updates on the Phi-3 template? @abetlen
I can't get phi-3 to work with the -ngl flag. It seems that offloading any layers results in a crash in llama_decode_internal.
llava-phi-3-mini uses the Phi-3 instruct chat template. I think it's similar to the current llava-1-5 handler, but with the Phi-3 instruct template instead of the Llama 2 one.
Format:
`<|user|>\nQuestion <|end|>\n<|assistant|>`
The stop word is `<|end|>`; for system messages, use `<|system|>`. I think you can easily adapt the llava-1-5 handler for Phi-3, something like the sketch below.
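Rough, untested sketch of what I mean. It assumes a recent llama-cpp-python where Llava15ChatHandler builds its prompt from a Jinja2 CHAT_FORMAT class attribute like the other llava-style handlers; the LlavaPhi3ChatHandler name, the file paths, and the image URL are placeholders I made up.

```python
# Rough sketch only (untested). Assumes a llama-cpp-python version where
# Llava15ChatHandler renders its prompt from a Jinja2 CHAT_FORMAT class
# attribute, as the other llava-style handlers do; older versions would need
# the prompt-building method overridden instead. File names and the image URL
# below are placeholders.
from llama_cpp import Llama
from llama_cpp.llama_chat_format import Llava15ChatHandler


class LlavaPhi3ChatHandler(Llava15ChatHandler):
    # Same image handling as llava-1-5, but with the Phi-3 instruct template:
    # <|system|> / <|user|> / <|assistant|> turns, each closed with <|end|>.
    # Simplified: only plain-string content and {text, image_url} parts.
    CHAT_FORMAT = (
        "{% for message in messages %}"
        "{% if message.role == 'system' %}"
        "<|system|>\n{{ message.content }}<|end|>\n"
        "{% elif message.role == 'user' %}"
        "<|user|>\n"
        "{% if message.content is string %}"
        "{{ message.content }}"
        "{% else %}"
        "{% for content in message.content %}"
        "{% if content.type == 'image_url' %}{{ content.image_url.url }}{% endif %}"
        "{% if content.type == 'text' %}{{ content.text }}{% endif %}"
        "{% endfor %}"
        "{% endif %}"
        "<|end|>\n"
        "{% elif message.role == 'assistant' %}"
        "<|assistant|>\n{{ message.content }}<|end|>\n"
        "{% endif %}"
        "{% endfor %}"
        "{% if add_generation_prompt %}<|assistant|>\n{% endif %}"
    )


chat_handler = LlavaPhi3ChatHandler(clip_model_path="./llava-phi-3-mini-mmproj-f16.gguf")
llm = Llama(
    model_path="./llava-phi-3-mini-f16.gguf",
    chat_handler=chat_handler,
    n_ctx=4096,  # leave room for the image embedding tokens
)

response = llm.create_chat_completion(
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {
            "role": "user",
            "content": [
                {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
                {"type": "text", "text": "What is in this picture?"},
            ],
        },
    ],
    stop=["<|end|>"],  # Phi-3 end-of-turn token
)
print(response["choices"][0]["message"]["content"])
```

The image handling itself would stay whatever the llava-1-5 base class already does; only the prompt template and the `<|end|>` stop token change.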
Take a look, @abetlen.