xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

QUESTION: Chinese garbled code problem #708

Open xiaolibuzai-ovo opened 7 months ago

xiaolibuzai-ovo commented 7 months ago


I deployed a model, but its Chinese output is garbled. What could be the reason? For example:

[Image: screenshot of the garbled Chinese output]

This is my model.json:

{
    "version":1,
    "context_length":2048,
    "model_name":"Llama2-Chinese-7b-Chat",
    "model_lang":[
        "zh"
    ],
    "model_ability":[
        "chat"
    ],
    "model_description":"This is a custom model description.",
    "model_specs":[
        {
            "model_format":"pytorch",
            "model_size_in_billions":7,
            "quantizations":[
                "4-bit",
                "8-bit",
                "none"
            ],
            "model_id":"FlagAlpha/Llama2-Chinese-7b-Chat",
            "model_uri":"file:///home/mingzhi/xInference/FlagAlpha/Llama2-Chinese-7b-Chat"
        }
    ],
    "prompt_style":{
        "style_name":"LLAMA2",
        "system_prompt":"<s>[INST] <<SYS>>\nYou are a helpful AI assistant.\n<</SYS>>\n\n",
        "roles":[
            "[INST]",
            "[/INST]"
        ],
        "intra_message_sep":" ",
        "inter_message_sep":" </s><s>",
        "stop_token_ids":[
            2
        ],
        "stop":[
            "</s>"
        ]
    }
}
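
Before looking at the model itself, it can help to rule out simple problems with a definition like the one above. The following is a minimal sketch, not part of Xinference: the helper names `local_path_from_uri` and `check_definition` are hypothetical, treating every listed key as required is an assumption, and the tokenizer file names are the ones typically found in Hugging Face Llama-family checkpoints. It only checks that the JSON has the fields shown above and that the directory behind `model_uri` exists and contains tokenizer files.

```python
import json
from pathlib import Path
from urllib.parse import urlparse

# Keys taken from the model.json shown above. Treating all of them as
# required is an assumption of this sketch, not Xinference's actual schema.
EXPECTED_KEYS = [
    "version", "context_length", "model_name", "model_lang",
    "model_ability", "model_specs", "prompt_style",
]

# File names commonly seen in Hugging Face Llama-family checkpoints (assumption).
TOKENIZER_FILES = ["tokenizer.model", "tokenizer.json", "tokenizer_config.json"]


def local_path_from_uri(uri: str) -> Path:
    """Turn a file:// URI into a local filesystem path."""
    parsed = urlparse(uri)
    if parsed.scheme != "file":
        raise ValueError(f"expected a file:// URI, got {uri!r}")
    return Path(parsed.path)


def check_definition(definition: dict) -> list:
    """Return human-readable problems; an empty list means nothing obvious."""
    problems = [f"missing key: {key}" for key in EXPECTED_KEYS if key not in definition]
    for spec in definition.get("model_specs", []):
        uri = spec.get("model_uri")
        if uri is None:
            continue  # nothing local to inspect for this spec
        model_dir = local_path_from_uri(uri)
        if not model_dir.is_dir():
            problems.append(f"model directory not found: {model_dir}")
            continue
        present = [name for name in TOKENIZER_FILES if (model_dir / name).exists()]
        if not present:
            problems.append(f"no tokenizer files found in {model_dir}")
    return problems


if __name__ == "__main__":
    config = Path("model.json")
    if config.exists():
        # Read explicitly as UTF-8 so non-ASCII text in the file is preserved.
        definition = json.loads(config.read_text(encoding="utf-8"))
        for problem in check_definition(definition):
            print(problem)
```

An empty result only means the definition and the local files look plausible. It does not confirm that the tokenizer or prompt style matches the model, so garbled output can still have another cause.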

This is the Hugging Face URL: https://huggingface.co/FlagAlpha/Llama2-Chinese-7b-Chat

ikun52099 commented 5 months ago

Do you have the vocab.txt file for Llama2-Chinese-7b-Chat?