xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

After installing xinference on Ascend 910 following the tutorial, launching a model fails with "Failed to import module 'SentenceTransformer'" #2002

Closed LightingFx closed 1 month ago

LightingFx commented 2 months ago

Hardware: Ascend 910
CANN version: 23.0.rc2
python: 3.10
transformers: 4.43.3
xinference: 0.13.3
sentence-transformers: 3.0.1
Startup command: `xinference-local --host 0.0.0.0 --port 9997`
Model launch: `xinference launch --model-engine transformers --model-name bge-base-zh-v1.5 --model-type embedding`

After completing the installation according to the official "Installation on Ascend NPU" guide, launching the model fails with `ImportError: [address=0.0.0.0:45409, pid=484951] Failed to import module 'SentenceTransformer'`. However, sentence-transformers is in fact installed in the environment. Is this related to the Ascend-specific installation, and how can it be resolved?

Full error output:

2024-08-02 03:32:53,889 xinference.model.utils 484276 INFO     Use model cache from a different hub.
2024-08-02 03:32:54,387 xinference.core.worker 484276 ERROR    Failed to load model bge-base-zh-v1.5-1-0
Traceback (most recent call last):
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/model/embedding/core.py", line 130, in load
    from sentence_transformers import SentenceTransformer
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/__init__.py", line 7, in <module>
    from sentence_transformers.cross_encoder.CrossEncoder import CrossEncoder
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/cross_encoder/__init__.py", line 1, in <module>
    from .CrossEncoder import CrossEncoder
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py", line 16, in <module>
    from sentence_transformers.evaluation.SentenceEvaluator import SentenceEvaluator
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/evaluation/__init__.py", line 1, in <module>
    from .BinaryClassificationEvaluator import BinaryClassificationEvaluator
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/evaluation/BinaryClassificationEvaluator.py", line 8, in <module>
    from sklearn.metrics import average_precision_score
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/__init__.py", line 85, in <module>
    from .utils._show_versions import show_versions
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/_show_versions.py", line 15, in <module>
    from ._openmp_helpers import _openmp_parallelism_enabled
ImportError: /home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/worker.py", line 841, in launch_builtin_model
    await model_ref.load()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/model.py", line 295, in load
    self._model.load()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/model/embedding/core.py", line 138, in load
    raise ImportError(f"{error_message}\n\n{''.join(installation_guide)}")
ImportError: [address=0.0.0.0:45409, pid=484951] Failed to import module 'SentenceTransformer'

Please make sure 'sentence-transformers' is installed. You can install it by `pip install sentence-transformers`

2024-08-02 03:32:54,464 xinference.api.restful_api 483847 ERROR    [address=0.0.0.0:45409, pid=484951] Failed to import module 'SentenceTransformer'

Please make sure 'sentence-transformers' is installed. You can install it by `pip install sentence-transformers`
Traceback (most recent call last):
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/model/embedding/core.py", line 130, in load
    from sentence_transformers import SentenceTransformer
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/__init__.py", line 7, in <module>
    from sentence_transformers.cross_encoder.CrossEncoder import CrossEncoder
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/cross_encoder/__init__.py", line 1, in <module>
    from .CrossEncoder import CrossEncoder
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/cross_encoder/CrossEncoder.py", line 16, in <module>
    from sentence_transformers.evaluation.SentenceEvaluator import SentenceEvaluator
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/evaluation/__init__.py", line 1, in <module>
    from .BinaryClassificationEvaluator import BinaryClassificationEvaluator
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sentence_transformers/evaluation/BinaryClassificationEvaluator.py", line 8, in <module>
    from sklearn.metrics import average_precision_score
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/__init__.py", line 85, in <module>
    from .utils._show_versions import show_versions
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/_show_versions.py", line 15, in <module>
    from ._openmp_helpers import _openmp_parallelism_enabled
ImportError: /home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/api/restful_api.py", line 848, in launch_model
    model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/supervisor.py", line 988, in launch_builtin_model
    await _launch_model()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/supervisor.py", line 952, in _launch_model
    await _launch_one_model(rep_model_uid)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/supervisor.py", line 932, in _launch_one_model
    await worker_ref.launch_builtin_model(
  File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
    async with lock:
  File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
    result = await result
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/worker.py", line 841, in launch_builtin_model
    await model_ref.load()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/core/model.py", line 295, in load
    self._model.load()
  File "/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/xinference/model/embedding/core.py", line 138, in load
    raise ImportError(f"{error_message}\n\n{''.join(installation_guide)}")
ImportError: [address=0.0.0.0:45409, pid=484951] Failed to import module 'SentenceTransformer'

Please make sure 'sentence-transformers' is installed. You can install it by `pip install sentence-transformers`

Environment info:

(xinfer) chenfx@altas:~/xinfer$ pip list
Package                   Version
------------------------- -----------
accelerate                0.33.0
aiobotocore               2.7.0
aiofiles                  23.2.1
aiohappyeyeballs          2.3.4
aiohttp                   3.10.0
aioitertools              0.11.0
aioprometheus             23.12.0
aiosignal                 1.3.1
altair                    5.3.0
annotated-types           0.7.0
anyio                     4.4.0
ascendctools              0.1.0
async-timeout             4.0.3
attrs                     23.2.0
auto-tune                 0.1.0
bcrypt                    4.2.0
botocore                  1.31.64
certifi                   2022.12.7
cffi                      1.16.0
charset-normalizer        2.1.1
click                     8.1.7
cloudpickle               3.0.0
colorama                  0.4.6
contourpy                 1.2.1
cryptography              43.0.0
cycler                    0.12.1
dataflow                  0.0.1
decorator                 5.1.1
distro                    1.9.0
ecdsa                     0.19.0
exceptiongroup            1.2.2
fastapi                   0.110.3
ffmpy                     0.4.0
filelock                  3.13.1
fonttools                 4.53.1
frozenlist                1.4.1
fsspec                    2023.10.0
gradio                    4.26.0
gradio_client             0.15.1
h11                       0.14.0
hccl                      0.1.0
hccl-parser               0.1
httpcore                  1.0.5
httpx                     0.27.0
huggingface-hub           0.24.5
idna                      3.4
importlib_resources       6.4.0
Jinja2                    3.1.3
jmespath                  1.0.1
joblib                    1.4.2
jsonschema                4.23.0
jsonschema-specifications 2023.12.1
kiwisolver                1.4.5
markdown-it-py            3.0.0
MarkupSafe                2.1.5
matplotlib                3.9.1
mdurl                     0.1.2
modelscope                1.17.0
mpmath                    1.3.0
msadvisor                 1.0.0
multidict                 6.0.5
networkx                  3.2.1
numpy                     1.26.3
op-compile-tool           0.1.0
op-gen                    0.1
op-test-frame             0.1
opc-tool                  0.1.0
openai                    1.37.1
opencv-contrib-python     4.10.0.84
orjson                    3.10.6
packaging                 24.1
pandas                    2.2.2
passlib                   1.7.4
peft                      0.12.0
pillow                    10.2.0
pip                       24.0
psutil                    6.0.0
pyasn1                    0.6.0
pycparser                 2.22
pydantic                  2.8.2
pydantic_core             2.20.1
pydub                     0.25.1
Pygments                  2.18.0
pynvml                    11.5.3
pyparsing                 3.1.2
python-dateutil           2.9.0.post0
python-jose               3.3.0
python-multipart          0.0.9
pytz                      2024.1
PyYAML                    6.0.1
quantile-python           1.1
referencing               0.35.1
regex                     2024.7.24
requests                  2.28.1
rich                      13.7.1
rpds-py                   0.19.1
rsa                       4.9
ruff                      0.5.5
s3fs                      2023.10.0
safetensors               0.4.3
schedule-search           0.0.1
scikit-learn              1.5.1
scipy                     1.14.0
semantic-version          2.10.0
sentence-transformers     3.0.1
setuptools                69.5.1
shellingham               1.5.4
six                       1.16.0
sniffio                   1.3.1
sse-starlette             2.1.3
starlette                 0.37.2
sympy                     1.12
tabulate                  0.9.0
tblib                     3.0.0
te                        0.4.0
threadpoolctl             3.5.0
timm                      1.0.8
tokenizers                0.19.1
tomlkit                   0.12.0
toolz                     0.12.1
torch                     2.1.0
torch-npu                 2.1.0.post3
torchvision               0.16.0
tornado                   6.4.1
tqdm                      4.66.4
transformers              4.43.3
typer                     0.11.1
typing_extensions         4.9.0
tzdata                    2024.1
urllib3                   1.26.13
uvicorn                   0.30.4
uvloop                    0.19.0
websockets                11.0.3
wheel                     0.43.0
wrapt                     1.16.0
xinference                0.13.3
xoscar                    0.3.2
yarl                      1.9.4
qinxuye commented 2 months ago

It looks like sklearn is failing while `sentence_transformers` is being imported:

ImportError: /home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0: cannot allocate memory in static TLS block

Please check your environment.
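A quick way to narrow down this kind of failure (a generic diagnostic sketch, not part of Xinference): first check whether the package is installed at all, then attempt the actual import, which surfaces loader-level errors such as the "cannot allocate memory in static TLS block" one from libgomp. The misleading "Please make sure 'sentence-transformers' is installed" hint appears precisely because the package is installed but fails at load time.

```python
import importlib
import importlib.util


def diagnose(name: str) -> str:
    """Return 'missing', 'ok', or the ImportError text for a top-level module."""
    if importlib.util.find_spec(name) is None:
        return "missing"              # genuinely not installed
    try:
        importlib.import_module(name)
        return "ok"                   # installed and importable
    except ImportError as exc:
        return str(exc)               # installed, but fails while loading
                                      # (e.g. a shared-library / static-TLS error)


print(diagnose("sklearn"))
```

Running this in the failing environment should print the libgomp TLS error rather than "missing", confirming the problem is library loading order, not a missing dependency.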

zlpmetyou commented 1 month ago

export LD_PRELOAD=$LD_PRELOAD:/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0

LightingFx commented 1 month ago

export LD_PRELOAD=$LD_PRELOAD:/home/chenfx/miniconda3/envs/xinfer/lib/python3.10/site-packages/sklearn/utils/../../scikit_learn.libs/libgomp-d22c30c5.so.1.0.0

Thanks, it works normally now after setting that environment variable.
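The path in that export line is install-specific (the `d22c30c5` hash and the env path will differ between machines). A small sketch to locate the vendored libgomp programmatically and print the matching export line; the `scikit_learn.libs` directory name is taken from the traceback above, and other scikit-learn builds may not vendor libgomp at all.

```python
import glob
import os
import sysconfig

# site-packages of the current interpreter (works inside conda envs too)
site_packages = sysconfig.get_paths()["purelib"]

# scikit-learn wheels vendor their shared libraries under scikit_learn.libs/
matches = glob.glob(os.path.join(site_packages, "scikit_learn.libs", "libgomp-*.so*"))

if matches:
    print(f'export LD_PRELOAD="$LD_PRELOAD:{matches[0]}"')
else:
    print("# no vendored libgomp found under", site_packages)
```

Evaluate the printed line (or paste it into your shell profile) before starting `xinference-local`, so the workaround survives scikit-learn upgrades that change the hashed filename.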

zlpmetyou commented 1 month ago

@LightingFx After deploying, is the NPU actually being used? I deployed and launched an embedding model, and the vectorization step is extremely slow; it looks like the NPU isn't being used.

LightingFx commented 1 month ago

@LightingFx After deploying, is the NPU actually being used? I deployed and launched an embedding model, and the vectorization step is extremely slow; it looks like the NPU isn't being used.

I also deployed an embedding model. Vectorization was indeed slow at first, but I checked with both top and npu-smi, and the NPU shows utilization and is being used.

zlpmetyou commented 1 month ago

Mine is consistently slow. I tried specifying the NPU with --gpu-idx, but it had no effect.

zlpmetyou commented 1 month ago

It seems the NPU is only used when launching an LLM with the Transformers engine.

qinxuye commented 1 month ago

Embedding models do use the NPU, but in our own testing it is indeed very slow.

Our enterprise edition provides an acceleration solution; the open-source version does not have one yet.

windcandle commented 21 hours ago

Xinference on Ascend servers isn't practical: it's extremely slow, and there are plenty of other problems besides. Officially, the recommended inference engines are MindIE (supports 310P and 910P) and Ascend vLLM (910P only).

qinxuye commented 17 hours ago

The open-source version will be slow; on Ascend we recommend Xinference Enterprise.