Issue 401557122 · closed 5 days ago
@amumu96 Could you look at this issue?
The Transformers engine does not support the AWQ format, and an AWQ model should not be launched with the Transformers engine. Could you tell me how you launched it? If you want to launch AWQ-format internvl2, please use the LMDEPLOY engine.
Just tried this with no success; it returns the error: RuntimeError: Failed to launch model, detail: [address=0.0.0.0:58248, pid=3691355] Model internvl2 cannot be run on engine LMDEPLOY.
I used this command to launch: xinference launch --model-engine LMDEPLOY --model-name internvl2 --size-in-billions 76 --model-format awq --quantization Int4
Did you install lmdeploy?
Confirmed it's running after installing lmdeploy, thanks!
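For future readers, the working sequence was roughly the following (the launch command is the one quoted above; run the install step inside the same environment/container that serves the model):

```shell
# The LMDEPLOY engine is only available if lmdeploy is installed;
# without it, launching fails with "cannot be run on engine LMDEPLOY".
pip install lmdeploy

# Then launch with the LMDEPLOY engine:
xinference launch --model-engine LMDEPLOY --model-name internvl2 \
  --size-in-billions 76 --model-format awq --quantization Int4
```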
OK, close this issue then.
System Info
{
  "version": 1,
  "context_length": 32000,
  "model_name": "InternVL2-Llama3-76B-AWQ",
  "model_lang": ["en", "zh"],
  "model_ability": ["generate", "vision", "chat"],
  "model_description": "",
  "model_family": "other",
  "model_specs": [
    {
      "model_format": "awq",
      "model_size_in_billions": 76,
      "quantizations": ["4-bit"],
      "model_id": null,
      "model_hub": "huggingface",
      "model_uri": "/root/.xinference/InternVL2-Llama3-76B-AWQ",
      "model_revision": null
    }
  ],
  "prompt_style": null,
  "is_builtin": false
}
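A minimal sketch (an illustrative helper, not part of Xinference's API) that loads a custom model spec like the one above and flags AWQ specs that would be routed to an incompatible engine. The format-to-engine mapping here is an assumption for illustration, based on the maintainer's comment that AWQ must not run on the Transformers engine:

```python
import json

# Custom model spec from the issue, abridged to the relevant fields.
SPEC = json.loads("""
{
  "model_name": "InternVL2-Llama3-76B-AWQ",
  "model_family": "other",
  "model_specs": [
    {"model_format": "awq",
     "model_size_in_billions": 76,
     "quantizations": ["4-bit"]}
  ]
}
""")

def engines_for_format(model_format: str) -> list[str]:
    # Assumed mapping for illustration: plain pytorch checkpoints can use
    # the Transformers engine, while AWQ-quantized weights need an engine
    # with AWQ kernels such as LMDEPLOY.
    table = {
        "pytorch": ["Transformers", "vLLM"],
        "awq": ["vLLM", "LMDEPLOY"],
    }
    return table.get(model_format, [])

for spec in SPEC["model_specs"]:
    engines = engines_for_format(spec["model_format"])
    # Guard against the mistake in this issue: launching AWQ on Transformers.
    assert "Transformers" not in engines or spec["model_format"] != "awq", (
        f"{spec['model_format']} should not launch on the Transformers engine"
    )
    print(spec["model_format"], "->", engines)
```

Running this before `xinference launch` makes the engine mismatch visible up front instead of failing at launch time.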
Running Xinference with Docker?
Version info
latest
The command used to start Xinference
docker run -dit -v /data/ez/llms:/root/.xinference -e XINFERENCE_HOME=/root/.xinference -p 9999:9997 --gpus all --shm-size 20g --ipc=host aicenter/xinference:latest xinference-local -H 0.0.0.0 --log-level debug
Reproduction
File "/usr/local/lib/python3.10/dist-packages/xinference/model/llm/pytorch/core.py", line 771, in _get_full_prompt
    assert self.model_family.prompt_style is not None
Expected behavior
How to configure InternVL2-Llama3-76B-AWQ?