xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

The new version seems unable to deploy bge-m3; it throws an error #1950

Closed leslie2046 closed 1 month ago

leslie2046 commented 1 month ago

System Info / 系統信息

CentOS 7.9, Python 3.10.6. `pip3 list` output (Package / Version):


absl-py 2.1.0 accelerate 0.28.0 addict 2.4.0 aiobotocore 2.7.0 aiofiles 23.2.1 aiohttp 3.9.3 aioitertools 0.11.0 aioprometheus 23.12.0 aiosignal 1.3.1 alembic 1.13.2 aliyun-python-sdk-core 2.15.0 aliyun-python-sdk-kms 2.16.2 altair 5.2.0 annotated-types 0.6.0 anthropic 0.25.6 antlr4-python3-runtime 4.9.3 anyio 4.3.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.3 attrdict 2.0.1 attrs 23.2.0 audioread 3.0.1 auto_gptq 0.7.1 autoawq 0.2.5 autoawq_kernels 0.0.6 autopage 0.5.2 av 11.0.0 Babel 2.15.0 bcrypt 4.1.2 beautifulsoup4 4.12.3 bibtexparser 2.0.0b7 bitsandbytes 0.42.0 bleach 6.1.0 boto3 1.28.64 botocore 1.31.64 build 1.2.1 cdifflib 1.2.6 certifi 2024.2.2 cffi 1.16.0 cfgv 3.4.0 charset-normalizer 3.3.2 chatglm-cpp 0.3.2 chattts 0.1.1 click 8.1.7 cliff 4.7.0 clldutils 3.22.2 cloudpickle 3.0.0 cmaes 0.10.0 cmd2 2.4.3 colorama 0.4.6 coloredlogs 15.0.1 colorlog 6.8.2 comm 0.2.2 conformer 0.3.2 contourpy 1.2.0 controlnet-aux 0.0.7 crcmod 1.7 cryptography 42.0.5 csvw 3.3.0 ctransformers 0.2.27 ctranslate2 4.1.0 cupy-cuda12x 12.1.0 cycler 0.12.1 Cython 3.0.10 datasets 2.18.0 debugpy 1.8.2 decorator 5.1.1 defusedxml 0.7.1 diffusers 0.21.3 dill 0.3.8 diskcache 5.6.3 distlib 0.3.8 distro 1.9.0 dlinfo 1.2.1 ecdsa 0.18.0 editdistance 0.8.1 einops 0.8.0 einx 0.3.0 encodec 0.1.1 exceptiongroup 1.2.0 executing 2.0.1 fastapi 0.110.3 faster-whisper 1.0.1 fastjsonschema 2.20.0 fastrlock 0.8.2 ffmpeg-python 0.2.0 ffmpy 0.3.2 filelock 3.13.1 FlagEmbedding 1.2.9 flatbuffers 24.3.25 fonttools 4.50.0 fqdn 1.5.1 frozendict 2.4.4 frozenlist 1.4.1 fsspec 2023.10.0 future 1.0.0 gast 0.5.4 gdown 5.2.0 gekko 1.0.7 gradio 4.26.0 gradio_client 0.15.1 greenlet 3.0.3 grpcio 1.65.1 h11 0.14.0 hiredis 3.0.0 httpcore 1.0.4 httptools 0.6.1 httpx 0.27.0 huggingface-hub 0.24.2 humanfriendly 10.0 hydra-colorlog 1.2.0 hydra-core 1.3.2 hydra-optuna-sweeper 1.2.0 HyperPyYAML 1.2.2 identify 2.6.0 idna 3.6 imageio 2.34.1 importlib_metadata 7.1.0 
importlib_resources 6.4.0 inflect 7.2.1 iniconfig 2.0.0 interegular 0.3.3 ipykernel 6.29.5 ipython 8.25.0 ipywidgets 8.1.3 isodate 0.6.1 isoduration 20.11.0 jedi 0.19.1 Jinja2 3.1.3 jmespath 0.10.0 joblib 1.3.2 json5 0.9.25 jsonpointer 3.0.0 jsonschema 4.21.1 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.2 jupyter_server_terminals 0.5.3 jupyterlab 4.2.4 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.11 kiwisolver 1.4.5 language-tags 1.2.0 lark 1.1.9 lazy_loader 0.4 libnacl 2.1.0 librosa 0.10.1 lightning 2.3.3 lightning-utilities 0.11.6 llama_cpp_python 0.2.65 llvmlite 0.42.0 lxml 5.2.2 Mako 1.3.5 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matcha-tts 0.0.4 matplotlib 3.8.3 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.0.2 modelscope 1.13.2 more-itertools 10.3.0 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 multiprocess 0.70.16 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nemo_text_processing 1.0.2 nest-asyncio 1.6.0 networkx 3.2.1 ninja 1.11.1.1 nodeenv 1.9.1 notebook 7.2.1 notebook_shim 0.2.4 numba 0.59.1 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 8.9.2.26 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.18.1 nvidia-nvjitlink-cu12 12.4.99 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 onnxruntime 1.16.0 openai 1.14.2 openai-whisper 20231117 opencv-contrib-python 4.9.0.80 opencv-python 4.9.0.80 optimum 1.17.1 optuna 2.10.1 orjson 3.9.15 oss2 2.18.4 outlines 0.0.34 overrides 7.7.0 packaging 24.0 pandas 2.2.1 pandocfilters 1.5.1 parso 0.8.4 passlib 1.7.4 pbr 6.0.0 peft 0.10.0 pexpect 4.9.0 phonemizer 3.2.1 pillow 10.2.0 pip 23.3.1 platformdirs 4.2.0 pluggy 1.5.0 plumbum 1.8.2 pooch 1.8.1 pre-commit 3.7.1 prettytable 3.10.2 prometheus_client 0.20.0 
prompt_toolkit 3.0.47 protobuf 4.25.4 psutil 5.9.8 ptyprocess 0.7.0 pure-eval 0.2.2 py-cpuinfo 9.0.0 pyarrow 15.0.2 pyarrow-hotfix 0.6 pyasn1 0.5.1 pybase16384 0.3.7 pycparser 2.21 pycryptodome 3.20.0 pydantic 2.6.4 pydantic_core 2.16.3 pydub 0.25.1 Pygments 2.17.2 pylatexenc 2.10 pynini 2.1.5 pynvml 11.5.0 pyparsing 3.1.2 pyperclip 1.9.0 pyproject_hooks 1.1.0 PySocks 1.7.1 pytest 8.3.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-json-logger 2.0.7 python-multipart 0.0.9 pytorch-lightning 2.3.3 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.2 quantile-python 1.1 ray 2.10.0 rdflib 7.0.0 redis 5.0.7 referencing 0.34.0 regex 2023.12.25 requests 2.31.0 rfc3339-validator 0.1.4 rfc3986 1.5.0 rfc3986-validator 0.1.1 rich 13.7.1 rootutils 1.0.7 rouge 1.0.1 rpds-py 0.18.0 rpyc 6.0.0 rsa 4.9 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.3.4 s3fs 2023.10.0 s3transfer 0.7.0 sacremoses 0.1.1 safetensors 0.4.2 scikit-image 0.23.2 scikit-learn 1.4.1.post1 scipy 1.12.0 seaborn 0.13.2 segments 2.2.1 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 3.0.1 sentencepiece 0.2.0 setuptools 68.2.2 sglang 0.1.14 shellingham 1.5.4 simplejson 3.19.2 six 1.16.0 sniffio 1.3.1 sortedcontainers 2.4.0 soundfile 0.12.1 soupsieve 2.5 soxr 0.3.7 SQLAlchemy 2.0.31 sse-starlette 2.0.0 stack-data 0.6.3 starlette 0.37.2 stevedore 5.2.0 sympy 1.12 tabulate 0.9.0 tblib 3.0.0 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorizer 2.9.0 terminado 0.18.1 threadpoolctl 3.4.0 tifffile 2024.4.24 tiktoken 0.6.0 timm 0.9.16 tinycss2 1.3.0 tokenizers 0.19.1 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.1.2 torchaudio 2.1.2 torchmetrics 1.4.0.post0 torchvision 0.16.2 tornado 6.4.1 tqdm 4.66.2 traitlets 5.14.3 transformers 4.43.2 transformers-stream-generator 0.0.5 triton 2.1.0 typeguard 4.3.0 typer 0.10.0 types-python-dateutil 2.9.0.20240316 typing_extensions 4.10.0 tzdata 2024.1 Unidecode 1.3.8 uri-template 1.3.0 uritemplate 4.1.1 urllib3 2.0.7 uvicorn 0.29.0 uvloop 0.19.0 
vector-quantize-pytorch 1.14.24 virtualenv 20.26.3 vllm 0.3.3 vocos 0.1.0 watchfiles 0.21.0 wcwidth 0.2.13 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 11.0.3 Werkzeug 3.0.3 WeTextProcessing 1.0.1 wget 3.2 wheel 0.41.2 widgetsnbextension 4.0.11 wrapt 1.16.0 xformers 0.0.23.post1 xinference 0.13.3 xinference-client 0.13.3 xoscar 0.3.0 xxhash 3.4.1 yapf 0.40.2 yarl 1.9.4 zipp 3.18.1 zmq 0.0.0 zstandard 0.22.0

Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?

Version info / 版本信息

0.13.3

The command used to start Xinference / 用以启动 xinference 的命令

xinference launch --model-name bge-m3 --model-type embedding -r 4 --n-gpu 2

```
Launch model name: bge-m3 with kwargs: {}
Traceback (most recent call last):
  File "/home/njue/anaconda3/envs/xinference/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/click/decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 903, in model_launch
    model_uid = client.launch_model(
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1041, in launch_model
    raise RuntimeError(
RuntimeError: Failed to launch model, detail: [address=0.0.0.0:40232, pid=138956] Failed to import transformers.trainer because of the following error (look up to see its traceback): cannot import name 'is_mlu_available' from 'accelerate.utils' (/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/accelerate/utils/__init__.py)
```
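The failure looks like a version mismatch: the installed accelerate 0.28.0 predates the `is_mlu_available` helper that transformers 4.43.2 tries to import from `accelerate.utils`. A minimal pre-flight check along these lines could surface the mismatch before launch (the `accelerate_compatible` helper and the 0.29.0 threshold are illustrative assumptions, not part of Xinference):

```python
# Hypothetical pre-flight check: flag an accelerate install older than the
# release assumed to have first shipped is_mlu_available.
from importlib.metadata import PackageNotFoundError, version


def parse(v: str) -> tuple:
    # Naive numeric parse; stops at the first non-numeric component,
    # so pre-release tags like "0.29.0rc1" are ignored.
    parts = []
    for p in v.split("."):
        if not p.isdigit():
            break
        parts.append(int(p))
    return tuple(parts)


def accelerate_compatible(min_version: str = "0.29.0") -> bool:
    """Return True if accelerate is installed and at least min_version."""
    try:
        installed = version("accelerate")
    except PackageNotFoundError:
        return False
    return parse(installed) >= parse(min_version)
```

With the assumed threshold, `parse("0.28.0") < parse("0.29.0")` holds, so the 0.28.0 environment above would be flagged before the launch fails deep inside `transformers.trainer`.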

Reproduction / 复现过程

xinference launch --model-name bge-m3 --model-type embedding -r 4 --n-gpu 2

Expected behavior / 期待表现

bge-m3 launches normally.

lhs0627 commented 1 month ago

Any progress? Did you solve it?

leslie2046 commented 1 month ago

@lhs0627 Yes, it's solved.

lhs0627 commented 1 month ago

@leslie2046 When I run `xinference launch --model-name bge-m3 --model-type embedding`, it keeps failing with `RuntimeError: Failed to launch model, detail: [address=127.0.0.1:60909, pid=4556] Failed to download model 'bge-m3' after multiple retries`. Do you know how to fix this?

leslie2046 commented 1 month ago

@lhs0627 That's because you aren't downloading the model through a proxy. Either use one, or set the environment variable HF_ENDPOINT=https://hf-mirror.com to download from the mirror.
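Concretely, the mirror workaround amounts to exporting `HF_ENDPOINT` in the shell before launching (the launch command is the one from this thread; whether hf-mirror.com is reachable from your network is your own check):

```shell
# Route Hugging Face Hub downloads through the hf-mirror.com mirror.
export HF_ENDPOINT=https://hf-mirror.com
# Then relaunch the model in the same shell, e.g.:
#   xinference launch --model-name bge-m3 --model-type embedding
echo "HF_ENDPOINT=$HF_ENDPOINT"
```

Note the variable must be set in the environment of the `xinference` process that performs the download, so export it before starting the server or the launch command.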