xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

BUG: Models larger than 20 GB cause the machine to reboot #1662

Open tsiens opened 1 week ago

tsiens commented 1 week ago

Describe the bug

Running any model larger than 20 GB causes the machine to reboot automatically.

To Reproduce

To help us to reproduce this bug, please provide information below:

  1. Your Python version.
  2. The version of xinference you use: Docker image, v0.12.0
  3. Versions of crucial packages: 8 × 4090 GPUs (24 GB each); docker info output:

     Client:
      Version: 25.0.5
      Context: default
      Debug Mode: false

     Server:
      Containers: 4 (Running: 2, Paused: 0, Stopped: 2)
      Images: 4
      Server Version: 25.0.5
      Storage Driver: overlay2
       Backing Filesystem: extfs
       Supports d_type: true
       Using metacopy: false
       Native Overlay Diff: true
       userxattr: false
      Logging Driver: json-file
      Cgroup Driver: cgroupfs
      Cgroup Version: 1
      Plugins:
       Volume: local
       Network: bridge host ipvlan macvlan null overlay
       Log: awslogs fluentd gcplogs gelf journald json-file local splunk syslog
      Swarm: inactive
      Runtimes: runc io.containerd.runc.v2
      Default Runtime: runc
      Init Binary: docker-init
      containerd version: 7c3aca7a610df76212171d200ca3811ff6096eb8
      runc version: v1.1.12-0-g51d5e94
      init version: de40ad0
      Security Options: seccomp (Profile: builtin)
      Kernel Version: 3.10.0-1160.el7.x86_64
      Operating System: CentOS Linux 7 (Core)
      OSType: linux
      Architecture: x86_64
      CPUs: 144
      Total Memory: 755.1GiB

  4. Full stack of the error.
  5. Minimized code to reproduce the error.

Expected behavior

Small models (e.g. qwen2) load and run with no problems at all.

  1. With models larger than 20 GB (e.g. glm4) assigned to two GPUs, the web chat UI on port 9997 returns gibberish a few times, then the machine reboots.
  2. With a single GPU assigned, it returns correct answers a few times, then the machine reboots.
  3. ollama installed via Docker works normally, so I don't think this is a machine or Docker problem.
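For reference, a minimal sketch of the launch path described above, using the xinference Python client against the default endpoint on port 9997. The model name, size, format, and engine values below are illustrative assumptions, not taken verbatim from the reporter's setup; the reporter tried both the vLLM and transformers engines and both built-in and custom-registered models.

```python
from xinference.client import RESTfulClient

# Connect to the xinference supervisor (the port-9997 endpoint mentioned above).
client = RESTfulClient("http://localhost:9997")

# Launch a >20 GB model across two GPUs. The name/size/format/engine values
# are assumptions for illustration; set n_gpu=1 to reproduce the single-GPU case.
model_uid = client.launch_model(
    model_name="glm4-chat",
    model_engine="vllm",            # the reporter also tried "transformers"
    model_size_in_billions=9,
    model_format="pytorch",
    n_gpu=2,
)
print(model_uid)

# Chatting is then done through the web UI on port 9997, where the reported
# gibberish answers appear a few times before the machine reboots.
```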

Additional context

Add any other context about the problem here.

qinxuye commented 1 week ago

Which engine are you using?

tsiens commented 1 week ago

> Which engine are you using?

I've tried both the transformers and vLLM engines, and both custom-registered models and the built-in language models.

tsiens commented 1 week ago

I also tried creating an env with Anaconda and starting xinference-local directly on the host; the machine rebooted in the same way.

qinxuye commented 4 days ago

Has the problem been resolved?

tsiens commented 3 days ago

> Has the problem been resolved?

No, I still can't find the cause.