Closed auxpd closed 1 month ago
The image seems to be lost, so I've transcribed it into text.
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 5 days since being marked as stale.
Describe the bug
I failed to load the model with the following parameters, on the latest version of xinference (0.8.1).
I started qwen-chat from the UI page with:
- model format: gptq
- model size: 14
- quantization: int4
- n-gpu: 2
console error log:

```
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/models/qwen.py", line 231, in __init__
    self.transformer = QWenModel(config, linear_method)
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/models/qwen.py", line 193, in __init__
    self.h = nn.ModuleList([
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/models/qwen.py", line 194, in <listcomp>
    QWenBlock(config, linear_method)
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/models/qwen.py", line 147, in __init__
    self.mlp = QWenMLP(config.hidden_size,
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/models/qwen.py", line 49, in __init__
    self.c_proj = RowParallelLinear(intermediate_size,
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/layers/linear.py", line 495, in __init__
    self.linear_weights = self.linear_method.create_weights(
  File "/home/auxpd/miniconda3/envs/xinference/lib/python3.10/site-packages/vllm/model_executor/layers/quantization/gptq.py", line 100, in create_weights
    raise ValueError(
ValueError: [address=0.0.0.0:39287, pid=33034] The input size is not aligned with the quantized weight shape. This can be caused by too large tensor parallel size.
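For context on why this fails with `n-gpu: 2` but works on a single GPU: vLLM's GPTQ path requires each GPU's shard of a row-parallel weight to stay divisible by the GPTQ group size. The sketch below is a simplified, hypothetical version of that check, not vLLM's actual code; the sizes (`intermediate = 13696`, `group_size = 128`) are assumptions for a Qwen-14B int4 checkpoint.

```python
# Simplified sketch (assumption, not vLLM's real implementation) of the
# alignment check behind "The input size is not aligned with the quantized
# weight shape": when a GPTQ weight is split across GPUs, each partition
# must remain a whole number of quantization groups.

def shard_is_aligned(input_size: int, tp_size: int, group_size: int = 128) -> bool:
    """Return True if splitting `input_size` across `tp_size` GPUs keeps
    each partition divisible by the GPTQ group size."""
    if input_size % tp_size != 0:
        return False
    return (input_size // tp_size) % group_size == 0

intermediate = 13696  # assumed MLP projection input size for Qwen-14B

print(shard_is_aligned(intermediate, tp_size=1))  # True  -> loads on one GPU
print(shard_is_aligned(intermediate, tp_size=2))  # False -> 6848 % 128 != 0, raises ValueError
```

So with these assumed sizes, the error is a property of the checkpoint's shapes rather than a bug in the launch command: the per-GPU partition `13696 / 2 = 6848` is not a multiple of the 128-element quantization group, which is exactly what "too large tensor parallel size" refers to.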