xinference v0.15.4: launching a model fails with RuntimeError: Failed to launch model, detail: [address=0.0.0.0:56145, pid=108] No available slot found for the model #2455
```
Traceback (most recent call last):
  File "/usr/local/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/usr/local/lib/python3.10/dist-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/xinference/deploy/cmdline.py", line 901, in model_launch
    model_uid = client.launch_model(
  File "/usr/local/lib/python3.10/dist-packages/xinference/client/restful/restful_client.py", line 940, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:56145, pid=108] No available slot found for the model
```
System Info

Ubuntu 22.04, xinference v0.15.4. Installed packages (`pip list`):

```
accelerate 0.34.0 aiofiles 23.2.1 aiohappyeyeballs 2.4.0 aiohttp 3.10.5 aioprometheus 23.12.0 aiosignal 1.3.1 aliyun-python-sdk-core 2.16.0 aliyun-python-sdk-kms 2.16.5 altair 5.4.1 annotated-types 0.7.0 anthropic 0.36.0 antlr4-python3-runtime 4.9.3 anyio 4.4.0 argcomplete 3.5.1 async-timeout 4.0.3 attrdict 2.0.1 attrs 24.2.0 audioread 3.0.1 auto_gptq 0.7.1 autoawq 0.2.5 autoawq_kernels 0.0.6 av 13.1.0 bcrypt 4.2.0 beautifulsoup4 4.12.3 bitsandbytes 0.44.1 black 24.10.0 boto3 1.28.64 botocore 1.31.85 cdifflib 1.2.6 certifi 2019.11.28 cffi 1.17.1 chardet 3.0.4 charset-normalizer 3.3.2 chattts 0.1.1 click 8.1.7 cloudpickle 3.0.0 colorama 0.4.6 coloredlogs 15.0.1 conformer 0.3.2 contourpy 1.3.0 controlnet-aux 0.0.7 crcmod 1.7 cryptography 43.0.1 cycler 0.12.1 Cython 3.0.11 datamodel-code-generator 0.26.1 datasets 2.21.0 dbus-python 1.2.16 decorator 5.1.1 DeepCache 0.1.1 diffusers 0.30.3 dill 0.3.8 diskcache 5.6.3 distro 1.9.0 distro-info 0.23+ubuntu1.1 dnspython 2.7.0 ecdsa 0.19.0 editdistance 0.8.1 einops 0.8.0 einx 0.3.0 email_validator 2.2.0 encodec 0.1.1 eva-decord 0.6.1 exceptiongroup 1.2.2 fastapi 0.110.3 ffmpy 0.4.0 filelock 3.15.4 FlagEmbedding 1.2.11 flashinfer 0.1.6+cu124torch2.4 flatbuffers 24.3.25 fonttools 4.54.1 frozendict 2.4.5 frozenlist 1.4.1 fsspec 2024.6.1 funasr 1.1.12 fvcore 0.1.5.post20221221 gdown 5.2.0 gekko 1.2.1 genson 1.3.0 gguf 0.9.1 gradio 4.26.0 gradio_client 0.15.1 h11 0.14.0 hf_transfer 0.1.8 hiredis 3.0.0 httpcore 1.0.5 httptools 0.6.1 httpx 0.27.2 huggingface-hub 0.24.6 humanfriendly 10.0 hydra-core 1.3.2 HyperPyYAML 1.2.2 idna 2.8 imageio 2.35.1 imageio-ffmpeg 0.5.1 importlib_metadata 8.4.0 importlib_resources 6.4.5 inflect 5.6.2 interegular 0.3.3 iopath 0.1.10 isort 5.13.2 jaconv 0.4.0 jamo 0.4.1 jieba 0.42.1 Jinja2 3.1.4 jiter 0.5.0 jj-pytorchvideo 0.1.5 jmespath 0.10.0 joblib 1.4.2 jsonschema 4.23.0 jsonschema-specifications 2023.12.1 kaldiio 2.18.0 kiwisolver 1.4.7 lark 1.2.2 lazy_loader 0.4 libnacl 2.1.0 librosa 0.10.2.post1 lightning 2.4.0 lightning-utilities 0.11.7 litellm 1.49.1 llama_cpp_python 0.2.90 llvmlite 0.43.0 lm-format-enforcer 0.10.6 loguru 0.7.2 loralib 0.1.2 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matplotlib 3.9.2 mdurl 0.1.2 mistral_common 1.3.4 modelscope 1.17.1 mpmath 1.3.0 msgpack 1.0.8 msgspec 0.18.6 multidict 6.0.5 multiprocess 0.70.16 mypy-extensions 1.0.0 narwhals 1.9.3 natsort 8.4.0 nemo_text_processing 1.0.2 nest-asyncio 1.6.0 networkx 3.3 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-ml-py 12.560.30 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.6.68 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 onnxruntime-gpu 1.16.0 openai 1.51.2 opencv-contrib-python-headless 4.10.0.84 opencv-python 4.10.0.84 optimum 1.23.1 orjson 3.10.7 ormsgpack 1.5.0 oss2 2.19.0 outlines 0.0.46 packaging 24.1 pandas 2.2.2 parameterized 0.9.0 partial-json-parser 0.2.1.1.post4 passlib 1.7.4 pathspec 0.12.1 peft 0.13.2 pillow 10.4.0 pip 24.2 platformdirs 4.3.6 plumbum 1.9.0 pooch 1.8.2 portalocker 2.10.1 prometheus_client 0.20.0 prometheus-fastapi-instrumentator 7.0.0 protobuf 5.28.0 psutil 6.0.0 py-cpuinfo 9.0.0 pyairports 2.1.1 pyarrow 17.0.0 pyasn1 0.6.1 pybase16384 0.3.7 pycountry 24.6.1 pycparser 2.22 pycryptodome 3.21.0 pydantic 2.8.2 pydantic_core 2.20.1 pydub 0.25.1 Pygments 2.18.0 PyGObject 3.36.0 pynini 2.1.5 pynndescent 0.5.13 pyparsing 3.1.4 PySocks 1.7.1 python-apt 2.0.1+ubuntu0.20.4.1 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-multipart 0.0.12 pytorch-lightning 2.4.0 pytorch-wpe 0.0.1 pytz 2024.1 PyYAML 6.0.2 pyzmq 26.2.0 quantile-python 1.1 qwen-vl-utils 0.0.8 ray 2.35.0 redis 5.1.1 referencing 0.35.1 regex 2024.7.24 requests 2.32.3 requests-unixsocket 0.2.0 rich 13.9.2 rouge 1.0.1 rpds-py 0.20.0 rpyc 6.0.1 rsa 4.9 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.6.9 s3transfer 0.7.0 sacremoses 0.1.1 safetensors 0.4.4 scikit-image 0.24.0 scikit-learn 1.5.2 scipy 1.14.1 semantic-version 2.10.0 sentence-transformers 3.1.0 sentencepiece 0.2.0 setuptools 75.1.0 sglang 0.3.3.post1 shellingham 1.5.4 six 1.14.0 sniffio 1.3.1 soundfile 0.12.1 soupsieve 2.6 soxr 0.5.0.post1 sse-starlette 2.1.3 starlette 0.37.2 sympy 1.13.2 tabulate 0.9.0 tblib 3.0.0 tensorboardX 2.6.2.2 tensorizer 2.9.0 termcolor 2.5.0 threadpoolctl 3.5.0 tifffile 2024.9.20 tiktoken 0.7.0 timm 1.0.9 tokenizers 0.19.1 toml 0.10.2 tomli 2.0.2 tomlkit 0.12.0 torch 2.4.0 torch-complex 0.4.4 torchaudio 2.4.0 torchmetrics 1.4.3 torchvision 0.19.0 tqdm 4.66.5 transformers 4.44.2 transformers-stream-generator 0.0.5 triton 3.0.0 typer 0.11.1 typing_extensions 4.12.2 tzdata 2024.1 umap-learn 0.5.6 unattended-upgrades 0.1 urllib3 2.0.7 uvicorn 0.30.6 uvloop 0.20.0 vector-quantize-pytorch 1.18.1 vllm 0.6.0 vllm-flash-attn 2.6.1 vocos 0.1.0 watchfiles 0.24.0 websockets 11.0.3 WeTextProcessing 1.0.3 wget 3.2 wheel 0.34.2 wrapt 1.16.0 xformers 0.0.27.post2 xinference 0.15.4 xoscar 0.3.3 xxhash 3.5.0 yacs 0.1.8 yarl 1.9.9 zipp 3.20.1 zmq 0.0.0 zstandard 0.23.0
```
Running Xinference with Docker?

Yes (see the docker run command below).
Version info
xinference, version 0.15.4
The command used to start Xinference

```shell
docker run -d --name xinference -e XINFERENCE_MODEL_SRC=modelscope -e HF_ENDPOINT=https://hf-mirror.com -p 9998:9997 --gpus device=1 --shm-size=128g xprobe/xinference xinference-local -H 0.0.0.0 --log-level debug
```

The models were then launched with:

```shell
xinference launch --model-name bge-reranker-v2-m3 --model-type rerank
xinference launch --model-name ChatTTS --model-type audio
```
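The container above only exposes GPU 1 (`--gpus device=1`), so every launched model must fit on that single device. As a hedged sketch of one thing worth ruling out (not a confirmed fix), the same container can be started with all GPUs visible so the worker has more devices to place models on:

```shell
# Sketch: identical to the reported command, but with --gpus all
# so the Xinference worker can allocate models across every GPU.
docker run -d --name xinference \
  -e XINFERENCE_MODEL_SRC=modelscope \
  -e HF_ENDPOINT=https://hf-mirror.com \
  -p 9998:9997 \
  --gpus all \
  --shm-size=128g \
  xprobe/xinference \
  xinference-local -H 0.0.0.0 --log-level debug
```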
Reproduction

1. Tried older versions v0.15.3 and v0.15.2; neither works.
2. Tried downgrading sentence-transformers to 3.1.0 and 3.1.1; neither works either.

The failing command:

```shell
xinference launch --model-name bge-reranker-v2-m3 --model-type rerank
```

It fails with the same traceback shown at the top of this report.
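"No available slot found for the model" suggests the worker believes all of its model slots are occupied. As a hedged diagnostic sketch (assuming the server from the docker command above, reachable on port 9998; the model UID below is hypothetical), it may help to list what is currently loaded and free a slot before retrying:

```shell
# List models currently occupying slots on the worker.
xinference list --endpoint http://localhost:9998

# Free a slot by terminating one of them (replace the UID with
# one reported by `xinference list`).
xinference terminate --endpoint http://localhost:9998 --model-uid <model-uid>

# Retry the launch that previously failed.
xinference launch --endpoint http://localhost:9998 \
  --model-name bge-reranker-v2-m3 --model-type rerank
```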
Expected behavior

The model launches successfully.