abetlen / llama-cpp-python

Python bindings for llama.cpp
https://llama-cpp-python.readthedocs.io
MIT License

Add support for llava-1-5-phi3-mini #1443

Open CaptainOfHacks opened 6 months ago

CaptainOfHacks commented 6 months ago

llava-phi-3-mini uses the Phi-3-instruct chat template. I think it is similar to the current llava-1-5 handler, but with the Phi-3 instruct template instead of the Llama 2 one.

Format: <|user|>\nQuestion <|end|>\n<|assistant|>
The stop word is <|end|>. For the system message, use <|system|>.
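As a rough illustration of that format, here is a minimal sketch in plain Python. The function name build_phi3_prompt is hypothetical, and the exact whitespace (no space before <|end|>, a newline after the final <|assistant|>) is an assumption, not something confirmed in this thread.

```python
from typing import Dict, List


def build_phi3_prompt(
    messages: List[Dict[str, str]], add_generation_prompt: bool = True
) -> str:
    """Render text-only messages in a Phi-3-instruct style layout.

    Hypothetical helper for illustration; the whitespace around the
    special tokens is an assumption.
    """
    parts = []
    for message in messages:
        role = message["role"]
        if role not in ("system", "user", "assistant"):
            raise ValueError(f"unsupported role: {role!r}")
        # Each turn is: <|role|> newline, content, <|end|> newline.
        parts.append(f"<|{role}|>\n{message['content']}<|end|>\n")
    if add_generation_prompt:
        # Leave the assistant turn open so the model continues from here.
        parts.append("<|assistant|>\n")
    return "".join(parts)


if __name__ == "__main__":
    print(build_phi3_prompt([{"role": "user", "content": "Question"}]))
```

Generation would then be stopped on <|end|>, as noted above.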

I think you can easily adapt the llava-1-5 handler for phi3:

from typing import Optional


class Llava15ChatHandler:
    DEFAULT_SYSTEM_MESSAGE: Optional[str] =  "A chat between a curious human and an artificial intelligence assistant.  The assistant gives helpful, detailed, and polite answers to the human's questions."

    CHAT_FORMAT = (
        "{% for message in messages %}"
        "{% if message.role == 'system' %}"
        "{{ message.content }}"
        "{% endif %}"
        "{% if message.role == 'user' %}"
        "{% if message.content is string %}"
        "\nUSER: {{ message.content }}"
        "{% endif %}"
        "{% if message.content is iterable %}"
        "\nUSER: "

        "{% for content in message.content %}"
        "{% if content.type == 'image_url' and content.image_url is string %}"
        "{{ content.image_url }}"
        "{% endif %}"
        "{% if content.type == 'image_url' and content.image_url is mapping %}"
        "{{ content.image_url.url }}"
        "{% endif %}"
        "{% endfor %}"

        "{% for content in message.content %}"
        "{% if content.type == 'text' %}"
        "{{ content.text }}"
        "{% endif %}"
        "{% endfor %}"

        "{% endif %}"
        "{% endif %}"
        "{% if message.role == 'assistant' and message.content is not none %}"
        "\nASSISTANT: {{ message.content }}"
        "{% endif %}"
        "{% endfor %}"
        "{% if add_generation_prompt %}"
        "\nASSISTANT: "
        "{% endif %}"
    )
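A Phi-3 variant of the template above might look like the following sketch. The constant name PHI3_LLAVA_CHAT_FORMAT is hypothetical, the token whitespace is an assumption, and the image-then-text ordering simply mirrors the llava-1-5 template. The extra "is not string" guard is added because Jinja treats strings as iterable. Presumably this string could be used as CHAT_FORMAT on a subclass of Llava15ChatHandler, but that is untested here.

```python
# Hypothetical Phi-3 adaptation of the llava-1-5 Jinja template.
# Untested against the real handler; special-token whitespace is assumed.
PHI3_LLAVA_CHAT_FORMAT = (
    "{% for message in messages %}"
    "{% if message.role == 'system' %}"
    "<|system|>\n{{ message.content }}<|end|>\n"
    "{% endif %}"
    "{% if message.role == 'user' %}"
    "<|user|>\n"
    # Plain string content is emitted as-is.
    "{% if message.content is string %}"
    "{{ message.content }}"
    "{% endif %}"
    # List content: image URLs first, then text parts (as in llava-1-5).
    "{% if message.content is iterable and message.content is not string %}"
    "{% for content in message.content %}"
    "{% if content.type == 'image_url' and content.image_url is string %}"
    "{{ content.image_url }}"
    "{% endif %}"
    "{% if content.type == 'image_url' and content.image_url is mapping %}"
    "{{ content.image_url.url }}"
    "{% endif %}"
    "{% endfor %}"
    "{% for content in message.content %}"
    "{% if content.type == 'text' %}"
    "{{ content.text }}"
    "{% endif %}"
    "{% endfor %}"
    "{% endif %}"
    "<|end|>\n"
    "{% endif %}"
    "{% if message.role == 'assistant' and message.content is not none %}"
    "<|assistant|>\n{{ message.content }}<|end|>\n"
    "{% endif %}"
    "{% endfor %}"
    # Open the assistant turn when a reply is expected.
    "{% if add_generation_prompt %}"
    "<|assistant|>\n"
    "{% endif %}"
)

# Stop sequence for generation, per the format described in this issue.
PHI3_STOP = "<|end|>"
```

Rendering it with jinja2 (if installed) for a single user message "Hi" and add_generation_prompt=True should produce the <|user|> turn followed by an open <|assistant|> turn.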

Take a look: @abetlen

abetlen commented 6 months ago

@CaptainOfHacks thank you, I'll take a look. Can you link to a Hugging Face model I can use for testing this?

xBelladonna commented 6 months ago

Forgive me if I'm intruding but this one looks like it'll work: https://huggingface.co/xtuner/llava-phi-3-mini-gguf

CaptainOfHacks commented 6 months ago

Yes, this is the model we want support for. Thank you @xBelladonna for the Hugging Face link.

CaptainOfHacks commented 6 months ago

Any updates on the phi3 template, @abetlen?

themanyone commented 6 months ago

I can't get phi-3 to work with the -ngl flag. It seems that offloading any layers results in a crash in llama_decode_internal.