xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Could not download qwen2-moe-instruct q4_k_m automatically #1951

Open Tint0ri opened 3 months ago

Tint0ri commented 3 months ago

System Info

Ubuntu 20.04

Running Xinference with Docker?

Version info

0.13.3

The command used to start Xinference

Normal startup mode

Reproduction

  1. Select qwen2-moe-instruct, llama.cpp, ggufv2, 14, Q4_K_M
  2. Launch
  3. After a short time, the UI reports an error: Server error: 400 - [address=0.0.0.0:40875, pid=194] Model path does not exist: /data/cache/qwen2-moe-instruct-ggufv2-14b/qwen2-57b-a14b-instruct-q4_k_m.gguf
  4. Backend log: ValueError: [address=0.0.0.0:40875, pid=194] Model path does not exist: /data/cache/qwen2-moe-instruct-ggufv2-14b/qwen2-57b-a14b-instruct-q4_k_m.gguf
  5. The /data/cache/qwen2-moe-instruct-ggufv2-14b directory contains only the file __valid_download_q4_k_m
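The state in steps 3–5 (a `__valid_download_*` marker present while the `.gguf` itself is missing) can be checked with a small sketch. The marker-file naming pattern and the paths are taken from this issue and are assumptions, not documented Xinference conventions:

```python
from pathlib import Path

def check_gguf_cache(cache_dir: str, gguf_name: str, quantization: str) -> str:
    """Classify the state of a (hypothetical) gguf model cache directory.

    The ``__valid_download_<quantization>`` marker name follows the file
    observed in this issue; the exact convention is an assumption.
    """
    cache = Path(cache_dir)
    model_file = cache / gguf_name
    marker = cache / f"__valid_download_{quantization}"

    if model_file.exists():
        return "ok"            # the .gguf is actually on disk
    if marker.exists():
        return "stale marker"  # download recorded as valid, but file missing
    return "not downloaded"

# Values taken from the error message in this issue:
print(check_gguf_cache(
    "/data/cache/qwen2-moe-instruct-ggufv2-14b",
    "qwen2-57b-a14b-instruct-q4_k_m.gguf",
    "q4_k_m",
))
```

On the reporter's machine this would print "stale marker", which matches the error: the download was recorded as complete, but the model file was never written.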

Expected behavior

Download the model and launch it with the llama.cpp backend. For comparison, ollama can download and launch this model without problems.

qinxuye commented 3 months ago

Can you remove /data/cache/qwen2-moe-instruct-ggufv2-14b and try again?
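A sketch of that cleanup, assuming the cache path from the error message (inspect the directory first, then delete it so the next launch triggers a fresh download):

```shell
# Cache directory from the error message; adjust if your cache root differs
CACHE_DIR=${CACHE_DIR:-/data/cache/qwen2-moe-instruct-ggufv2-14b}

# See what actually landed there (per the report: only __valid_download_q4_k_m)
ls -la "$CACHE_DIR" 2>/dev/null || echo "cache dir missing"

# Remove the stale entry so relaunching the model re-downloads everything
rm -rf "$CACHE_DIR"
```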

Tint0ri commented 3 months ago

Cleared the cache and tried again with q4_k_m; it still doesn't work. This may be the same error as https://github.com/xorbitsai/inference/issues/1906

qinxuye commented 3 months ago

Did you choose modelscope as the download hub?

Tint0ri commented 3 months ago

Same error with the modelscope download hub: ValueError: [address=0.0.0.0:37497, pid=170] Model path does not exist: /data/cache/qwen2-moe-instruct-ggufv2-14b/qwen2-57b-a14b-instruct-q4_k_m.gguf

I have already updated the Docker image to the latest version, 0.14.1.