[BUG] OOM when running glm4v after upgrading xinference to 0.12.2 - Githubissues

xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.

https://inference.readthedocs.io

Apache License 2.0

5.05k stars 402 forks

[BUG] OOM when running glm4v after upgrading xinference to 0.12.2 #1712

Open Yog-AI opened 3 months ago

Yog-AI commented 3 months ago

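The model is glm4-v-9b and the GPUs are a 3090 and a 4090. Launch command: xinference launch --model-engine Transformers --model-name glm-4v --size-in-billions 9 --model-format pytorch --quantization none

Problem description: right after upgrading xinference to 0.12.2, both the 3090 and the 4090 hit OOM (single machine, single GPU each). Before the upgrade, the model ran normally on both machines.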

QUNING1 commented 3 months ago

My A800 80G also runs out of memory when running glm4 on 0.12.2. Hard to believe.

Yog-AI commented 3 months ago

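Quoting QUNING1: My A800 80G also runs out of memory when running glm4 on 0.12.2. Hard to believe.

Yes. I am 200% sure that before I ran pip install xinference -U, glm4v was working normally and had already completed a batch of recognition tasks.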

Yog-AI commented 3 months ago

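I could not determine the cause of the OOM, so I can only offer a reference for anyone who runs into a similar problem later. Workaround: roll back the xinference version. Create a fresh conda virtual environment, then install: pip install "xinference[all]==0.12.0". After that, the glm4v model runs.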
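Before relaunching the model, it may help to confirm which release is actually installed in the active environment. The following is a minimal sketch that relies only on the two data points reported in this thread (0.12.2 hits OOM, 0.12.0 works). The helper names `installed_version` and `describe` are hypothetical and are not part of Xinference.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional

# Versions taken only from the reports in this thread; not an official list.
REPORTED_OOM = {"0.12.2"}
REPORTED_OK = {"0.12.0"}


def installed_version(package: str = "xinference") -> Optional[str]:
    """Return the installed version of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def describe(ver: Optional[str]) -> str:
    """Map a version string to what this thread reports about it."""
    if ver is None:
        return "not installed"
    if ver in REPORTED_OOM:
        return "reported to OOM with glm-4v in this thread"
    if ver in REPORTED_OK:
        return "reported to work with glm-4v in this thread"
    return "no report in this thread"


if __name__ == "__main__":
    ver = installed_version()
    print(f"xinference {ver}: {describe(ver)}")
```

Running this inside the new conda environment shows whether the pinned install took effect or whether a later `-U` upgrade replaced it.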

MiningIrving commented 3 months ago

+1. 3090 24G with N-GPU set to 4, but only the first card is used, and then it OOMs.
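One way to check whether only the first card is being used is to read per-GPU memory from `nvidia-smi` while the model loads. This is a rough sketch, assuming `nvidia-smi` is on the PATH and reports plain integer values in MiB. The helper names and the 1024 MiB threshold are arbitrary choices for illustration.

```python
import shutil
import subprocess
from typing import List, Tuple

QUERY = [
    "nvidia-smi",
    "--query-gpu=index,memory.used,memory.total",
    "--format=csv,noheader,nounits",
]


def parse_gpu_memory(csv_text: str) -> List[Tuple[int, int, int]]:
    """Parse 'index, used, total' CSV lines into (index, used_mib, total_mib)."""
    rows = []
    for line in csv_text.strip().splitlines():
        index, used, total = (int(field.strip()) for field in line.split(","))
        rows.append((index, used, total))
    return rows


def idle_gpus(rows: List[Tuple[int, int, int]], threshold_mib: int = 1024) -> List[int]:
    """Return indices of GPUs using less than `threshold_mib` of memory."""
    return [index for index, used, _ in rows if used < threshold_mib]


if __name__ == "__main__":
    if shutil.which("nvidia-smi") is None:
        print("nvidia-smi not found on PATH")
    else:
        result = subprocess.run(QUERY, capture_output=True, text=True)
        if result.returncode != 0:
            print("nvidia-smi failed:", result.stderr.strip())
        else:
            rows = parse_gpu_memory(result.stdout)
            for index, used, total in rows:
                print(f"GPU {index}: {used} / {total} MiB")
            print("GPUs below threshold:", idle_gpus(rows))
```

If GPU 0 is nearly full while the others appear in the "below threshold" list, the model is not being split across cards, which matches the behavior described above.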

ChengjieLi28 commented 3 months ago

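Quoting MiningIrving: +1. 3090 24G with N-GPU set to 4, but only the first card is used, and then it OOMs.

On the multi-GPU problem: the glm4v authors have changed their official code once, and we will soon update the Hugging Face model revision we pin so that it matches. For models downloaded from ModelScope, you need to delete the model and download it again; that should give you their latest fixed code.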
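To find what to delete before downloading the model again, a dry-run listing of cached glm-4v directories can help. The sketch below only prints paths and removes nothing. The cache locations are assumptions (the `MODELSCOPE_CACHE` and `XINFERENCE_HOME` environment variables, falling back to `~/.cache/modelscope` and `~/.xinference`), and the keyword `glm-4v` is a guess at the directory naming, so adjust both to your setup.

```python
import os
from pathlib import Path
from typing import Iterable, List


def candidate_roots() -> List[Path]:
    """Cache roots to search. These defaults are assumptions; adjust locally."""
    home = Path.home()
    return [
        Path(os.environ.get("MODELSCOPE_CACHE", home / ".cache" / "modelscope")),
        Path(os.environ.get("XINFERENCE_HOME", home / ".xinference")) / "cache",
    ]


def find_model_dirs(roots: Iterable[Path], keyword: str = "glm-4v") -> List[Path]:
    """List directories under `roots` whose name contains `keyword`."""
    matches = []
    for root in roots:
        if not root.is_dir():
            continue  # Skip cache roots that do not exist on this machine.
        for path in root.rglob("*"):
            if path.is_dir() and keyword in path.name.lower():
                matches.append(path)
    return sorted(matches)


if __name__ == "__main__":
    # Dry run: print only. Review the list, then delete by hand.
    for path in find_model_dirs(candidate_roots()):
        print(path)
```

Printing instead of deleting is deliberate: cache layouts differ between installs, so it is safer to review the list before removing anything.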

zhangever commented 2 months ago

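Quoting Yog-AI: I could not determine the cause of the OOM, so I can only offer a reference for anyone who runs into a similar problem later. Workaround: roll back the xinference version. Create a fresh conda virtual environment, then install: pip install "xinference[all]==0.12.0". After that, the glm4v model runs.

Thanks. I hit the same problem and struggled with it for a long time; this method is what finally fixed it.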

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 7 days with no activity.