chatchat-space / Langchain-Chatchat

Langchain-Chatchat (formerly Langchain-ChatGLM): local-knowledge-based RAG and Agent applications built with Langchain and LLMs such as ChatGLM, Qwen, and Llama
Apache License 2.0

Using Qwen-VL-Chat #4697

Open Bo-LiangD opened 1 month ago

Bo-LiangD commented 1 month ago

```
RuntimeError: Failed to generate chat completion, detail: [address=127.0.0.1:37257, pid=50009] Invalid prompt style:
```

This is my registration request:

```shell
curl 'http://127.0.0.1:9997/v1/model_registrations/LLM' \
  -H 'Accept: */*' \
  -H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: token=no_auth' \
  -H 'Origin: http://127.0.0.1:9997' \
  -H 'Referer: http://127.0.0.1:9997/ui/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  --data-raw '{"model":"{\"version\":1,\"model_name\":\"Qwen-VL-Chat-Int4\",\"model_description\":\"Qwen-VL-Chat-Int4\",\"context_length\":2048,\"model_lang\":[\"en\",\"zh\"],\"model_ability\":[\"generate\",\"chat\",\"vision\"],\"model_family\":\"other\",\"model_specs\":[{\"model_uri\":\"/home/wb/models/Qwen-VL-Chat-Int4\",\"model_size_in_billions\":9,\"model_format\":\"pytorch\",\"quantizations\":[\"none\"]}],\"prompt_style\":{\"style_name\":\"default\",\"system_prompt\":\"\",\"roles\":[\"user\",\"assistant\"]}}","persist":true}'
```

This is the launch request:

```shell
curl 'http://127.0.0.1:9997/v1/models' \
  -H 'Accept: */*' \
  -H 'Accept-Language: zh-CN,zh;q=0.9,en;q=0.8' \
  -H 'Connection: keep-alive' \
  -H 'Content-Type: application/json' \
  -H 'Cookie: token=no_auth' \
  -H 'Origin: http://127.0.0.1:9997' \
  -H 'Referer: http://127.0.0.1:9997/ui/' \
  -H 'Sec-Fetch-Dest: empty' \
  -H 'Sec-Fetch-Mode: cors' \
  -H 'Sec-Fetch-Site: same-origin' \
  -H 'User-Agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36' \
  -H 'sec-ch-ua: "Chromium";v="124", "Google Chrome";v="124", "Not-A.Brand";v="99"' \
  -H 'sec-ch-ua-mobile: ?0' \
  -H 'sec-ch-ua-platform: "Linux"' \
  --data-raw '{"model_uid":"Qwen-VL-Chat-Int4","model_name":"Qwen-VL-Chat-Int4","model_type":"LLM","model_engine":"Transformers","model_format":"pytorch","model_size_in_billions":9,"quantization":"none","n_gpu":"auto","replica":1,"request_limits":null,"worker_ip":null,"gpu_idx":null}'
```

Then, after starting Xinference and running image-text Q&A, I get: RuntimeError: Failed to generate chat completion, detail: [address=127.0.0.1:37257, pid=50009] Invalid prompt style:
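The error points at the custom `prompt_style` in the registration payload: `style_name` is set to `"default"` with an empty `system_prompt`, which Xinference's chat formatter apparently rejects. One possible workaround, sketched below, is to re-register the model with a Qwen-family ChatML prompt style instead. This is an assumption, not a verified fix: the `"QWEN"` style name, the system prompt text, and the `intra_message_sep` field mirror what Xinference's built-in Qwen specs have historically used, and may differ in your Xinference version.

```python
import json

# Hypothetical corrected registration body (an assumption, not a verified
# fix): replace the rejected style_name "default" with the ChatML-style
# "QWEN" prompt style that Xinference's built-in Qwen specs use.
registration = {
    "version": 1,
    "model_name": "Qwen-VL-Chat-Int4",
    "model_description": "Qwen-VL-Chat-Int4",
    "context_length": 2048,
    "model_lang": ["en", "zh"],
    "model_ability": ["generate", "chat", "vision"],
    "model_family": "other",
    "model_specs": [
        {
            "model_uri": "/home/wb/models/Qwen-VL-Chat-Int4",
            "model_size_in_billions": 9,
            "model_format": "pytorch",
            "quantizations": ["none"],
        }
    ],
    "prompt_style": {
        # Fields below are assumptions based on Xinference's built-in
        # Qwen prompt style; check your installed version's llm spec.
        "style_name": "QWEN",
        "system_prompt": "You are a helpful assistant.",
        "roles": ["user", "assistant"],
        "intra_message_sep": "\n",
    },
}

# The registration endpoint expects the model definition as a JSON string
# inside the outer payload, matching the curl request above.
payload = {"model": json.dumps(registration), "persist": True}
print(payload["model"][:60])
```

You could then POST `payload` to `http://127.0.0.1:9997/v1/model_registrations/LLM` (after deleting the old registration) and relaunch the model as in the second curl command.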

github-actions[bot] commented 1 week ago

This issue has been marked as stale because it has had no activity for more than 30 days.