xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

Bug under heavy user traffic: the service hangs automatically once the client count reaches 2500 #2001

Open stevensy123 opened 1 month ago

stevensy123 commented 1 month ago

System Info

Ubuntu 20.04, CUDA 12.2.0

Running Xinference with Docker?

Version info

0.13.0

The command used to start Xinference

The official Docker startup command.

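Reproduction

Contents of the log file:

-06-26 01:57:22,996 xoscar.backends.core 1 WARNING Actor caller has created too many clients (1750 >= 100), the global router may not be set.
2024-06-26 02:00:53,354 xoscar.backends.core 1 WARNING Actor caller has created too many clients (1760 >= 100), the global router may not be set.

This problem appears once the number of user calls gets large. When the client count reached 2500, the running model was automatically deregistered and its GPU memory was not released. Is it that a Docker installation cannot support calls from multiple users? Does a pip installation have the same problem?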
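To quantify how fast the client count climbs, the warning lines can be parsed from the log. The following is only a minimal sketch: the regular expression is derived from the two log lines quoted above, and the function names `client_counts` and `growth_per_hour` are hypothetical helpers, not part of Xinference or xoscar. Entries whose timestamp is incomplete (like the first quoted line) are skipped.

```python
import re
from datetime import datetime

# Pattern inferred from the warning lines quoted in this issue (an assumption,
# not an official log format). "client\w*" tolerates small typos in the word.
WARNING_RE = re.compile(
    r"(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3} .*?"
    r"too many client\w* \((?P<count>\d+) >= \d+\)"
)


def client_counts(lines):
    """Return (timestamp, client_count) pairs for every matching warning."""
    samples = []
    for line in lines:
        # finditer handles several log entries fused onto one physical line.
        for match in WARNING_RE.finditer(line):
            ts = datetime.strptime(match.group("ts"), "%Y-%m-%d %H:%M:%S")
            samples.append((ts, int(match.group("count"))))
    return samples


def growth_per_hour(samples):
    """Average number of new clients per hour between first and last sample."""
    if len(samples) < 2:
        return None
    (t0, c0), (t1, c1) = samples[0], samples[-1]
    hours = (t1 - t0).total_seconds() / 3600
    return (c1 - c0) / hours if hours > 0 else None
```

Passing an open log file to `client_counts` and the result to `growth_per_hour` gives a rough rate that can be compared against the actual request volume.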

Expected behavior

Normal operation with multiple users.

stevensy123 commented 1 month ago

@qinxuye Could you please offer some guidance?

codingl2k1 commented 1 month ago

Are there any more logs?

stevensy123 commented 1 month ago

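@codingl2k1 There are no other warnings or errors in the log file. It just keeps printing the lines above, adding one line for every 10 additional clients. I don't know where the limit of 100 is set, or how to change the limit of 100 and the 2500 one. Actually, it does not affect usage.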

masktone commented 1 month ago

I'm seeing a similar situation with xinference.

codingl2k1 commented 1 month ago

This InvalidStateError has been fixed by this PR: https://github.com/xorbitsai/xoscar/pull/87 Are you using the latest xinference?

stevensy123 commented 1 month ago

> This InvalidStateError has been fixed by this PR: xorbitsai/xoscar#87 Are you using the latest xinference?

I'm on v0.13.0. That problem does not affect usage. The "too many clients" warning is the problem I want to solve.

michaelxu1107 commented 1 month ago

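I'm using version 0.10.3, and I also frequently see warnings in the log like "WARNING Actor caller has created too many clients (1750 >= 100), the global router may not be set". Is this because connection resources are not released after a client request completes?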

michaelxu1107 commented 1 month ago

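> I'm using version 0.10.3, and I also frequently see warnings in the log like "WARNING Actor caller has created too many clients (1750 >= 100), the global router may not be set". Is this because connection resources are not released after a client request completes?

Our user concurrency is very low.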

qinxuye commented 1 month ago

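We have never been able to reproduce this problem. Please run pip list and share the versions, and also tell us which model and which engine you are using.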
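For anyone who wants to report only the packages most relevant to this issue rather than a full pip list, here is a minimal sketch using the standard-library `importlib.metadata`. The package selection in `PACKAGES` and the helper names `collect_versions` and `format_report` are assumptions made for illustration; extend the tuple as needed.

```python
import platform
from importlib.metadata import PackageNotFoundError, version

# Packages that seem most relevant to this thread (an assumed selection).
PACKAGES = ("xinference", "xoscar", "vllm", "torch", "transformers")


def collect_versions(packages=PACKAGES):
    """Map each package name to its installed version, or None if absent."""
    report = {}
    for name in packages:
        try:
            report[name] = version(name)
        except PackageNotFoundError:
            report[name] = None
    return report


def format_report(report):
    """Render the report in a pip-freeze-like form, one package per line."""
    lines = []
    for name, ver in report.items():
        lines.append(f"{name}=={ver}" if ver else f"{name}: not installed")
    return "\n".join(lines)


if __name__ == "__main__":
    print(f"python=={platform.python_version()}")
    print(format_report(collect_versions()))
```

Running this inside the same environment (or Docker container) that serves the model matters, since a host-side Python may have different versions installed.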

michaelxu1107 commented 1 month ago

> We have never been able to reproduce this problem. Please run pip list and share the versions, and also tell us which model and which engine you are using.

Hello, on our side we are using Python 3.10.11 and xinference 0.10.3. The model is Qwen1.5-32B-Chat and the inference engine is vllm. The Python dependencies are as follows:

accelerate==0.29.3
addict==2.4.0
aiobotocore==2.7.0
aiofiles==23.2.1
aiohttp==3.9.4
aioitertools==0.11.0
aioprometheus==23.12.0
aiosignal==1.3.1
aliyun-python-sdk-core==2.15.1
aliyun-python-sdk-kms==2.16.2
altair==5.3.0
annotated-types==0.6.0
anthropic==0.25.2
anyio==4.3.0
async-timeout==4.0.3
attrdict==2.0.1
attrs==23.2.0
auto_gptq==0.7.1
autoawq==0.2.3
autoawq_kernels==0.0.6
bcrypt==4.1.2
bitsandbytes==0.42.0
blessed==1.20.0
blinker==1.7.0
botocore==1.31.64
Brotli==1.1.0
certifi==2024.2.2
cffi==1.16.0
charset-normalizer==3.3.2
chatglm-cpp==0.3.1
click==8.1.7
cloudpickle==3.0.0
cmake==3.29.2
colorama==0.4.6
coloredlogs==15.0.1
ConfigArgParse==1.7
contourpy==1.2.1
controlnet-aux==0.0.7
crcmod==1.7
cryptography==42.0.5
cycler==0.12.1
Cython==3.0.10
dataclasses-json==0.6.4
datasets==2.19.0
diffusers==0.27.2
dill==0.3.8
diskcache==5.6.3
distro==1.9.0
ecdsa==0.19.0
einops==0.7.0
exceptiongroup==1.2.1
fastapi==0.110.2
ffmpy==0.3.2
filelock==3.13.4
FlagEmbedding==1.2.9
Flask==3.0.3
Flask-Cors==4.0.0
Flask-Login==0.6.3
fonttools==4.51.0
frozenlist==1.4.1
fsspec==2023.10.0
gast==0.5.4
gekko==1.1.1
gevent==24.2.1
geventhttpclient==2.2.0
gradio==4.26.0
gradio_client==0.15.1
greenlet==3.0.3
h11==0.14.0
httpcore==1.0.5
httptools==0.6.1
httpx==0.27.0
huggingface-hub==0.22.2
humanfriendly==10.0
idna==3.7
imageio==2.34.0
importlib_metadata==7.1.0
importlib_resources==6.4.0
iniconfig==2.0.0
interegular==0.3.3
itsdangerous==2.2.0
Jinja2==3.1.3
jmespath==0.10.0
joblib==1.4.0
jsonpatch==1.33
jsonpointer==2.4
jsonschema==4.21.1
jsonschema-specifications==2023.12.1
kiwisolver==1.4.5
langchain==0.1.16
langchain-community==0.0.34
langchain-core==0.1.45
langchain-text-splitters==0.0.1
langsmith==0.1.49
lark==1.1.9
lazy_loader==0.4
llama_cpp_python @ file:///home/aigc/iqas/llama_cpp_python-0.2.57-cp310-cp310-manylinux_2_17_x86_64.whl
llvmlite==0.42.0
locust==2.25.0
markdown-it-py==3.0.0
MarkupSafe==2.1.5
marshmallow==3.21.1
matplotlib==3.8.4
mdurl==0.1.2
modelscope==1.13.3
mpmath==1.3.0
msgpack==1.0.8
multidict==6.0.5
multiprocess==0.70.16
mypy-extensions==1.0.0
nest-asyncio==1.6.0
networkx==3.3
ninja==1.11.1.1
numba==0.59.1
numpy==1.26.4
nvidia-ml-py==12.550.52
nvidia-nccl-cu12==2.18.1
openai==1.17.1
opencv-contrib-python==4.9.0.80
opencv-python==4.9.0.80
optimum==1.19.0
orjson==3.10.1
oss2==2.18.4
outlines==0.0.34
packaging==23.2
pandas==2.2.2
passlib==1.7.4
peft==0.10.0
pillow==10.3.0
platformdirs==4.2.0
plumbum==1.8.2
prometheus_client==0.20.0
protobuf==5.26.1
psutil==5.9.8
py-cpuinfo==9.0.0
pyarrow==16.0.0
pyarrow-hotfix==0.6
pyasn1==0.6.0
pycparser==2.22
pycryptodome==3.20.0
pydantic==2.7.0
pydantic-settings==2.2.1
pydantic_core==2.18.1
pydub==0.25.1
Pygments==2.17.2
pynvml==11.5.0
pyparsing==3.1.2
python-dateutil==2.9.0.post0
python-dotenv==1.0.1
python-jose==3.3.0
python-multipart==0.0.9
pytz==2024.1
PyYAML==6.0.1
pyzmq==26.0.0
quantile-python==1.1
ray==2.11.0
referencing==0.34.0
regex==2024.4.16
requests==2.31.0
rich==13.7.1
rouge==1.0.1
roundrobin==0.0.4
rpds-py==0.18.0
rpyc==6.0.0
rsa==4.9
ruff==0.3.7
s3fs==2023.10.0
safetensors==0.4.3
scikit-image==0.23.1
scikit-learn==1.4.2
scipy==1.13.0
semantic-version==2.10.0
sentence-transformers==2.7.0
sentencepiece==0.2.0
sglang==0.1.14
shellingham==1.5.4
simplejson==3.19.2
six==1.16.0
sniffio==1.3.1
sortedcontainers==2.4.0
SQLAlchemy==2.0.29
sse-starlette==2.1.0
starlette==0.37.2
starlette-context==0.3.6
sympy==1.12
tabulate==0.9.0
tblib==3.0.0
tenacity==8.2.3
threadpoolctl==3.4.0
tifffile==2024.2.12
tiktoken==0.6.0
timm==0.9.16
tokenizers==0.15.2
tomli==2.0.1
tomlkit==0.12.0
toolz==0.12.1
torch @ file:///home/aigc/iqas/torch-2.1.2%2Bcu121-cp310-cp310-linux_x86_64.whl
torchvision @ file:///home/aigc/iqas/torchvision-0.16.2%2Bcu121-cp310-cp310-linux_x86_64.whl
tqdm==4.66.2
transformers==4.39.3
transformers-stream-generator==0.0.5
triton==2.1.0
typer==0.11.1
typing-inspect==0.9.0
typing_extensions==4.11.0
tzdata==2024.1
urllib3==2.0.7
uvicorn==0.29.0
uvloop==0.19.0
vllm==0.4.0.post1
watchfiles==0.21.0
wcwidth==0.2.13
websockets==11.0.3
Werkzeug==3.0.2
wrapt==1.16.0
xformers==0.0.23.post1
xinference==0.10.3
xoscar==0.3.0
xxhash==3.4.1
yapf==0.40.2
yarl==1.9.4
zipp==3.18.1
zmq==0.0.0
zope.event==5.0
zope.interface==6.3
zstandard==0.22.0