jina-reranker-v2 raises an error

leslie2046 commented 3 months ago

System Info

CentOS 7.9, Python 3.10.6. Installed packages (package name followed by version):

absl-py 2.1.0 accelerate 0.33.0 aiobotocore 2.7.0 aiofiles 23.2.1 aiohttp 3.9.5 aioitertools 0.11.0 aioprometheus 23.12.0 aiosignal 1.3.1 alembic 1.13.2 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 altair 5.3.0 annotated-types 0.7.0 antlr4-python3-runtime 4.9.3 anyio 4.4.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.3 attrs 23.2.0 audioread 3.0.1 autopage 0.5.2 Babel 2.15.0 bcrypt 4.2.0 beautifulsoup4 4.12.3 bibtexparser 2.0.0b7 bleach 6.1.0 botocore 1.31.64 certifi 2024.7.4 cffi 1.16.0 cfgv 3.4.0 charset-normalizer 3.3.2 chattts 0.1.1 click 8.1.7 cliff 4.7.0 clldutils 3.22.2 cloudpickle 3.0.0 cmaes 0.10.0 cmd2 2.4.3 colorama 0.4.6 coloredlogs 15.0.1 colorlog 6.8.2 comm 0.2.2 conformer 0.3.2 contourpy 1.2.1 crcmod 1.7 cryptography 43.0.0 csvw 3.3.0 cycler 0.12.1 Cython 3.0.10 debugpy 1.8.2 decorator 5.1.1 defusedxml 0.7.1 diffusers 0.25.0 diskcache 5.6.3 distlib 0.3.8 distro 1.9.0 dlinfo 1.2.1 ecdsa 0.19.0 editdistance 0.8.1 einops 0.8.0 einx 0.3.0 encodec 0.1.1 exceptiongroup 1.2.2 executing 2.0.1 fastapi 0.110.3 fastjsonschema 2.20.0 ffmpy 0.3.2 filelock 3.15.4 flatbuffers 24.3.25 fonttools 4.53.1 fqdn 1.5.1 frozendict 2.4.4 frozenlist 1.4.1 fsspec 2023.10.0 funasr 1.1.4 gdown 5.2.0 gradio 4.26.0 gradio_client 0.15.1 greenlet 3.0.3 grpcio 1.65.1 h11 0.14.0 httpcore 1.0.5 httpx 0.27.0 huggingface-hub 0.24.2 humanfriendly 10.0 hydra-colorlog 1.2.0 hydra-core 1.3.2 hydra-optuna-sweeper 1.2.0 HyperPyYAML 1.2.2 identify 2.6.0 idna 3.7 importlib_metadata 8.2.0 importlib_resources 6.4.0 inflect 7.3.1 iniconfig 2.0.0 ipykernel 6.29.5 ipython 8.26.0 ipywidgets 8.1.3 isodate 0.6.1 isoduration 20.11.0 jaconv 0.4.0 jamo 0.4.1 jedi 0.19.1 jieba 0.42.1 Jinja2 3.1.4 jmespath 0.10.0 joblib 1.4.2 json5 0.9.25 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.2 
jupyter_server_terminals 0.5.3 jupyterlab 4.2.4 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.11 kaldiio 2.18.0 kiwisolver 1.4.5 language-tags 1.2.0 lazy_loader 0.4 librosa 0.10.2.post1 lightning 2.3.3 lightning-utilities 0.11.6 llvmlite 0.43.0 lxml 5.2.2 Mako 1.3.5 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matcha-tts 0.0.6.0 matplotlib 3.9.1 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.0.2 modelscope 1.16.1 more-itertools 10.3.0 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.3 nodeenv 1.9.1 notebook 7.2.1 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.5.82 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 onnxruntime-gpu 1.16.0 openai 1.37.1 openai-whisper 20231117 opencv-contrib-python 4.10.0.84 optuna 2.10.1 orjson 3.10.6 oss2 2.18.6 overrides 7.7.0 packaging 24.1 pandas 2.2.2 pandocfilters 1.5.1 parso 0.8.4 passlib 1.7.4 pbr 6.0.0 peft 0.12.0 pexpect 4.9.0 phonemizer 3.2.1 pillow 10.4.0 pip 23.3.1 platformdirs 4.2.2 pluggy 1.5.0 pooch 1.8.2 pre-commit 3.7.1 prettytable 3.10.2 prometheus_client 0.20.0 prompt_toolkit 3.0.47 protobuf 4.25.4 psutil 6.0.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyarrow 17.0.0 pyasn1 0.6.0 pybase16384 0.3.7 pycparser 2.22 pycryptodome 3.20.0 pydantic 2.8.2 pydantic_core 2.20.1 pydub 0.25.1 Pygments 2.18.0 pylatexenc 2.10 pynini 2.1.5 pynndescent 0.5.13 pynvml 11.5.3 pyparsing 3.1.2 pyperclip 1.9.0 PySocks 1.7.1 pytest 8.3.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-json-logger 2.0.7 python-multipart 0.0.9 pytorch-lightning 2.3.3 pytorch-wpe 0.0.1 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.3 
quantile-python 1.1 rdflib 7.0.0 referencing 0.35.1 regex 2024.7.24 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986 1.5.0 rfc3986-validator 0.1.1 rich 13.7.1 rootutils 1.0.7 rpds-py 0.19.1 rsa 4.9 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.5.5 s3fs 2023.10.0 safetensors 0.4.3 scikit-learn 1.5.1 scipy 1.14.0 seaborn 0.13.2 segments 2.2.1 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 3.0.1 sentencepiece 0.2.0 setuptools 68.2.2 shellingham 1.5.4 six 1.16.0 sniffio 1.3.1 soundfile 0.12.1 soupsieve 2.5 soxr 0.4.0 SQLAlchemy 2.0.31 sse-starlette 2.1.2 stack-data 0.6.3 starlette 0.37.2 stevedore 5.2.0 sympy 1.13.1 tabulate 0.9.0 tblib 3.0.0 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorboardX 2.6.2.2 terminado 0.18.1 threadpoolctl 3.5.0 tiktoken 0.7.0 timm 1.0.7 tinycss2 1.3.0 tn 0.0.4 tokenizers 0.19.1 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.4.0 torch-complex 0.4.4 torchaudio 2.4.0 torchmetrics 1.4.0.post0 torchvision 0.19.0 tornado 6.4.1 tqdm 4.66.4 traitlets 5.14.3 transformers 4.43.3 triton 3.0.0 typeguard 4.3.0 typer 0.11.1 types-python-dateutil 2.9.0.20240316 typing_extensions 4.12.2 tzdata 2024.1 umap-learn 0.5.6 Unidecode 1.3.8 uri-template 1.3.0 uritemplate 4.1.1 urllib3 2.0.7 uvicorn 0.30.3 uvloop 0.19.0 vector-quantize-pytorch 1.15.6 virtualenv 20.26.3 vocos 0.1.0 wcwidth 0.2.13 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 11.0.3 Werkzeug 3.0.3 WeTextProcessing 1.0.3 wget 3.2 wheel 0.41.2 whisper 1.1.10 widgetsnbextension 4.0.11 wrapt 1.16.0 xinference 0.14.1 xinference-client 0.14.1 xoscar 0.3.2 xxhash 3.4.1 yarl 1.9.4 zipp 3.19.2

Running Xinference with Docker?

[ ] docker
[X] pip install
[ ] installation from source

Version info

0.14.1

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997

Reproduction

1. Send a rerank request (the -w format string prints the total request time in seconds): curl -X 'POST' 'http://192.168.1.88:9997/v1/rerank' -H 'accept: application/json' -H 'Content-Type: application/json' -d '{ "model": "jina-reranker-v2", "query": "A man is eating pasta.", "documents": [ "A man is eating food.", "A man is eating a piece of bread.", "The girl is carrying a baby.", "A man is riding a horse.", "A woman is playing violin."] }' -w "\n时间总计: %{time_total} 秒\n"

2. The request fails with the following response (total time: 0.325 seconds): {"detail":"[address=0.0.0.0:40587, pid=69485] CUDA error: device-side assert triggered\nCUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.\nFor debugging consider passing CUDA_LAUNCH_BLOCKING=1\nCompile with TORCH_USE_CUDA_DSA to enable device-side assertions.\n"}

3. Log:

ther API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.

Traceback (most recent call last):
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/api/restful_api.py", line 1223, in rerank
    scores = await model.rerank(
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 231, in send
    return self._process_result_message(result)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
    raise message.as_instanceof_cause()
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/pool.py", line 656, in send
    result = await self._run_coro(message.message_id, coro)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/pool.py", line 367, in _run_coro
    return await coro
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/api.py", line 384, in __on_receive__
    return await super().__on_receive__(message)  # type: ignore
  File "xoscar/core.pyx", line 558, in __on_receive__
    raise ex
  File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
    async with self._lock:
  File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
    with debug_async_timeout('actor_lock_timeout',
  File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
    result = await result
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/utils.py", line 45, in wrapped
    ret = await func(*args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 90, in wrapped_func
    ret = await fn(self, *args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 591, in rerank
    return await self._call_wrapper_json(
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 398, in _call_wrapper_json
    return await self._call_wrapper("json", fn, *args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 114, in _async_wrapper
    return await fn(*args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/model.py", line 415, in _call_wrapper
    ret = await asyncio.to_thread(fn, *args, **kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/asyncio/threads.py", line 25, in to_thread
    return await loop.run_in_executor(None, func_call)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/model/rerank/core.py", line 207, in rerank
    empty_cache()
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/device_utils.py", line 94, in empty_cache
    torch.cuda.empty_cache()
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/torch/cuda/memory.py", line 170, in empty_cache
    torch._C._cuda_emptyCache()
RuntimeError: [address=0.0.0.0:43151, pid=69266] CUDA error: device-side assert triggered
CUDA kernel errors might be asynchronously reported at some other API call, so the stacktrace below might be incorrect.
For debugging consider passing CUDA_LAUNCH_BLOCKING=1
Compile with TORCH_USE_CUDA_DSA to enable device-side assertions.
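
The error message itself warns that the stack trace may point at the wrong call, which would explain why it ends in torch.cuda.empty_cache() instead of the kernel that actually failed. A minimal sketch of restarting the server with CUDA_LAUNCH_BLOCKING=1, as the message suggests, might look like the following. The helper names blocking_cuda_env and launch_with_blocking_cuda are hypothetical, and the sketch assumes that the model worker processes inherit the environment of the xinference-local process.

```python
import os
import subprocess

# Start command copied from the report above.
XINFERENCE_CMD = ["xinference-local", "--host", "0.0.0.0", "--port", "9997"]


def blocking_cuda_env(base_env=None):
    """Return a copy of the environment with synchronous CUDA launches enabled.

    Hypothetical helper, not part of Xinference or PyTorch.
    """
    env = dict(os.environ if base_env is None else base_env)
    # With CUDA_LAUNCH_BLOCKING=1 every kernel launch is synchronous, so a
    # failing kernel is reported at its own call site rather than at a later,
    # unrelated CUDA API call.
    env["CUDA_LAUNCH_BLOCKING"] = "1"
    return env


def launch_with_blocking_cuda():
    """Start xinference-local with the debugging variable set (hypothetical helper)."""
    # Assumption: child processes spawned by the server inherit this environment.
    return subprocess.Popen(XINFERENCE_CMD, env=blocking_cuda_env())
```

Calling launch_with_blocking_cuda() and then repeating the curl request from step 1 should, under these assumptions, yield a traceback that ends at the operation which triggered the device-side assert. Synchronous launches slow inference down, so this setting is only meant for debugging.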

Expected behavior

The rerank call completes normally.

qinxuye commented 3 months ago

torch.cuda.empty_cache()

This error looks like an environment problem.
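
One way to check the environment is to compare the installed package versions with what PyTorch reports about CUDA. The sketch below is only an illustration: installed_versions and cuda_summary are hypothetical helper names, and which packages matter for this error is an assumption, not something established in this thread.

```python
from importlib import metadata


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def cuda_summary():
    """Report what PyTorch sees; returns None when torch cannot be imported."""
    try:
        import torch
    except ImportError:
        return None
    available = torch.cuda.is_available()
    devices = []
    if available:
        # Only query device names when a CUDA device is actually usable.
        devices = [
            torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())
        ]
    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,  # None for CPU-only builds
        "cuda_available": available,
        "devices": devices,
    }
```

For example, printing installed_versions(["torch", "transformers", "sentence-transformers", "xinference"]) together with cuda_summary() gives a compact snapshot that can be compared against a machine where the same request succeeds.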

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 7 days with no activity.