xorbitsai / inference

Replace OpenAI GPT with another LLM in your app by changing a single line of code. Xinference gives you the freedom to use any LLM you need. With Xinference, you're empowered to run inference with any open-source language models, speech recognition models, and multimodal models, whether in the cloud, on-premises, or even on your laptop.
https://inference.readthedocs.io
Apache License 2.0

SenseVoiceSmall does not seem to recognize WAV files #2061

Closed leslie2046 closed 3 months ago

leslie2046 commented 3 months ago

System Info

CentOS 7.9, Python 3.10.6

Installed packages (each package name is followed by its version):

absl-py 2.1.0 accelerate 0.33.0 aiobotocore 2.7.0 aiofiles 23.2.1 aiohttp 3.9.5 aioitertools 0.11.0 aioprometheus 23.12.0 aiosignal 1.3.1 alembic 1.13.2 aliyun-python-sdk-core 2.15.1 aliyun-python-sdk-kms 2.16.3 altair 5.3.0 annotated-types 0.7.0 antlr4-python3-runtime 4.9.3 anyio 4.4.0 argon2-cffi 23.1.0 argon2-cffi-bindings 21.2.0 arrow 1.3.0 asttokens 2.4.1 async-lru 2.0.4 async-timeout 4.0.3 attrs 23.2.0 audioread 3.0.1 autopage 0.5.2 Babel 2.15.0 bcrypt 4.2.0 beautifulsoup4 4.12.3 bibtexparser 2.0.0b7 bleach 6.1.0 botocore 1.31.64 certifi 2024.7.4 cffi 1.16.0 cfgv 3.4.0 charset-normalizer 3.3.2 chattts 0.1.1 click 8.1.7 cliff 4.7.0 clldutils 3.22.2 cloudpickle 3.0.0 cmaes 0.10.0 cmd2 2.4.3 colorama 0.4.6 coloredlogs 15.0.1 colorlog 6.8.2 comm 0.2.2 conformer 0.3.2 contourpy 1.2.1 crcmod 1.7 cryptography 43.0.0 csvw 3.3.0 cycler 0.12.1 Cython 3.0.10 debugpy 1.8.2 decorator 5.1.1 defusedxml 0.7.1 diffusers 0.25.0 diskcache 5.6.3 distlib 0.3.8 distro 1.9.0 dlinfo 1.2.1 ecdsa 0.19.0 editdistance 0.8.1 einops 0.8.0 einx 0.3.0 encodec 0.1.1 exceptiongroup 1.2.2 executing 2.0.1 fastapi 0.110.3 fastjsonschema 2.20.0 ffmpy 0.3.2 filelock 3.15.4 flatbuffers 24.3.25 fonttools 4.53.1 fqdn 1.5.1 frozendict 2.4.4 frozenlist 1.4.1 fsspec 2023.10.0 funasr 1.1.4 gdown 5.2.0 gradio 4.26.0 gradio_client 0.15.1 greenlet 3.0.3 grpcio 1.65.1 h11 0.14.0 httpcore 1.0.5 httpx 0.27.0 huggingface-hub 0.24.2 humanfriendly 10.0 hydra-colorlog 1.2.0 hydra-core 1.3.2 hydra-optuna-sweeper 1.2.0 HyperPyYAML 1.2.2 identify 2.6.0 idna 3.7 importlib_metadata 8.2.0 importlib_resources 6.4.0 inflect 7.3.1 iniconfig 2.0.0 ipykernel 6.29.5 ipython 8.26.0 ipywidgets 8.1.3 isodate 0.6.1 isoduration 20.11.0 jaconv 0.4.0 jamo 0.4.1 jedi 0.19.1 jieba 0.42.1 Jinja2 3.1.4 jmespath 0.10.0 joblib 1.4.2 json5 0.9.25 jsonpointer 3.0.0 jsonschema 4.23.0 jsonschema-specifications 2023.12.1 jupyter_client 8.6.2 jupyter_core 5.7.2 jupyter-events 0.10.0 jupyter-lsp 2.2.5 jupyter_server 2.14.2 
jupyter_server_terminals 0.5.3 jupyterlab 4.2.4 jupyterlab_pygments 0.3.0 jupyterlab_server 2.27.3 jupyterlab_widgets 3.0.11 kaldiio 2.18.0 kiwisolver 1.4.5 language-tags 1.2.0 lazy_loader 0.4 librosa 0.10.2.post1 lightning 2.3.3 lightning-utilities 0.11.6 llvmlite 0.43.0 lxml 5.2.2 Mako 1.3.5 Markdown 3.6 markdown-it-py 3.0.0 MarkupSafe 2.1.5 matcha-tts 0.0.6.0 matplotlib 3.9.1 matplotlib-inline 0.1.7 mdurl 0.1.2 mistune 3.0.2 modelscope 1.16.1 more-itertools 10.3.0 mpmath 1.3.0 msgpack 1.0.8 multidict 6.0.5 nbclient 0.10.0 nbconvert 7.16.4 nbformat 5.10.4 nest-asyncio 1.6.0 networkx 3.3 nodeenv 1.9.1 notebook 7.2.1 notebook_shim 0.2.4 numba 0.60.0 numpy 1.26.4 nvidia-cublas-cu12 12.1.3.1 nvidia-cuda-cupti-cu12 12.1.105 nvidia-cuda-nvrtc-cu12 12.1.105 nvidia-cuda-runtime-cu12 12.1.105 nvidia-cudnn-cu12 9.1.0.70 nvidia-cufft-cu12 11.0.2.54 nvidia-curand-cu12 10.3.2.106 nvidia-cusolver-cu12 11.4.5.107 nvidia-cusparse-cu12 12.1.0.106 nvidia-nccl-cu12 2.20.5 nvidia-nvjitlink-cu12 12.5.82 nvidia-nvtx-cu12 12.1.105 omegaconf 2.3.0 onnxruntime-gpu 1.16.0 openai 1.37.1 openai-whisper 20231117 opencv-contrib-python 4.10.0.84 optuna 2.10.1 orjson 3.10.6 oss2 2.18.6 overrides 7.7.0 packaging 24.1 pandas 2.2.2 pandocfilters 1.5.1 parso 0.8.4 passlib 1.7.4 pbr 6.0.0 peft 0.12.0 pexpect 4.9.0 phonemizer 3.2.1 pillow 10.4.0 pip 23.3.1 platformdirs 4.2.2 pluggy 1.5.0 pooch 1.8.2 pre-commit 3.7.1 prettytable 3.10.2 prometheus_client 0.20.0 prompt_toolkit 3.0.47 protobuf 4.25.4 psutil 6.0.0 ptyprocess 0.7.0 pure_eval 0.2.3 pyarrow 17.0.0 pyasn1 0.6.0 pybase16384 0.3.7 pycparser 2.22 pycryptodome 3.20.0 pydantic 2.8.2 pydantic_core 2.20.1 pydub 0.25.1 Pygments 2.18.0 pylatexenc 2.10 pynini 2.1.5 pynndescent 0.5.13 pynvml 11.5.3 pyparsing 3.1.2 pyperclip 1.9.0 PySocks 1.7.1 pytest 8.3.2 python-dateutil 2.9.0.post0 python-dotenv 1.0.1 python-jose 3.3.0 python-json-logger 2.0.7 python-multipart 0.0.9 pytorch-lightning 2.3.3 pytorch-wpe 0.0.1 pytz 2024.1 PyYAML 6.0.1 pyzmq 26.0.3 
quantile-python 1.1 rdflib 7.0.0 referencing 0.35.1 regex 2024.7.24 requests 2.32.3 rfc3339-validator 0.1.4 rfc3986 1.5.0 rfc3986-validator 0.1.1 rich 13.7.1 rootutils 1.0.7 rpds-py 0.19.1 rsa 4.9 ruamel.yaml 0.18.6 ruamel.yaml.clib 0.2.8 ruff 0.5.5 s3fs 2023.10.0 safetensors 0.4.3 scikit-learn 1.5.1 scipy 1.14.0 seaborn 0.13.2 segments 2.2.1 semantic-version 2.10.0 Send2Trash 1.8.3 sentence-transformers 3.0.1 sentencepiece 0.2.0 setuptools 68.2.2 shellingham 1.5.4 six 1.16.0 sniffio 1.3.1 soundfile 0.12.1 soupsieve 2.5 soxr 0.4.0 SQLAlchemy 2.0.31 sse-starlette 2.1.2 stack-data 0.6.3 starlette 0.37.2 stevedore 5.2.0 sympy 1.13.1 tabulate 0.9.0 tblib 3.0.0 tensorboard 2.17.0 tensorboard-data-server 0.7.2 tensorboardX 2.6.2.2 terminado 0.18.1 threadpoolctl 3.5.0 tiktoken 0.7.0 timm 1.0.7 tinycss2 1.3.0 tn 0.0.4 tokenizers 0.19.1 tomli 2.0.1 tomlkit 0.12.0 toolz 0.12.1 torch 2.4.0 torch-complex 0.4.4 torchaudio 2.4.0 torchmetrics 1.4.0.post0 torchvision 0.19.0 tornado 6.4.1 tqdm 4.66.4 traitlets 5.14.3 transformers 4.43.3 triton 3.0.0 typeguard 4.3.0 typer 0.11.1 types-python-dateutil 2.9.0.20240316 typing_extensions 4.12.2 tzdata 2024.1 umap-learn 0.5.6 Unidecode 1.3.8 uri-template 1.3.0 uritemplate 4.1.1 urllib3 2.0.7 uvicorn 0.30.3 uvloop 0.19.0 vector-quantize-pytorch 1.15.6 virtualenv 20.26.3 vocos 0.1.0 wcwidth 0.2.13 webcolors 24.6.0 webencodings 0.5.1 websocket-client 1.8.0 websockets 11.0.3 Werkzeug 3.0.3 WeTextProcessing 1.0.3 wget 3.2 wheel 0.41.2 whisper 1.1.10 widgetsnbextension 4.0.11 wrapt 1.16.0 xinference 0.14.1 xinference-client 0.14.1 xoscar 0.3.2 xxhash 3.4.1 yarl 1.9.4 zipp 3.19.2

Running Xinference with Docker?

Version info

0.14.1

The command used to start Xinference

xinference-local --host 0.0.0.0 --port 9997

Reproduction

  1. curl -X 'POST' 'http://192.168.1.88:9997/v1/audio/transcriptions' \
       -H 'accept: application/json' \
       -H "Content-Type: multipart/form-data" \
       -F file="@./speech.wav" \
       -F model="SenseVoiceSmall" \
       -w "\nTotal time: %{time_total} s\n"
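For anyone who wants to reproduce this from Python rather than curl, the request above can be approximated with the standard library alone. This is only a sketch: the helper names `build_transcription_request` and `send` are made up for illustration, the `audio/wav` part type is an assumption, and the host, port, file name, and model name are simply copied from the curl command.

```python
import io
import urllib.request
import uuid
from pathlib import Path


def build_transcription_request(base_url: str, wav_path: str, model: str) -> urllib.request.Request:
    """Build (but do not send) a multipart POST that mirrors the curl command."""
    boundary = uuid.uuid4().hex
    path = Path(wav_path)
    body = io.BytesIO()

    # Plain form field, equivalent to: -F model="SenseVoiceSmall"
    body.write(f"--{boundary}\r\n".encode())
    body.write(b'Content-Disposition: form-data; name="model"\r\n\r\n')
    body.write(model.encode() + b"\r\n")

    # File field, equivalent to: -F file="@./speech.wav"
    body.write(f"--{boundary}\r\n".encode())
    body.write(
        f'Content-Disposition: form-data; name="file"; filename="{path.name}"\r\n'.encode()
    )
    body.write(b"Content-Type: audio/wav\r\n\r\n")  # assumed part type
    body.write(path.read_bytes() + b"\r\n")

    # Closing boundary ends the multipart body.
    body.write(f"--{boundary}--\r\n".encode())

    return urllib.request.Request(
        url=f"{base_url}/v1/audio/transcriptions",
        data=body.getvalue(),
        headers={
            "Content-Type": f"multipart/form-data; boundary={boundary}",
            "Accept": "application/json",
        },
        method="POST",
    )


def send(request: urllib.request.Request, timeout: float = 120.0) -> str:
    """Send the request and return the raw response body as text."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read().decode("utf-8")
```

Calling `send(build_transcription_request("http://192.168.1.88:9997", "./speech.wav", "SenseVoiceSmall"))` should then issue the same POST as the curl command, provided the server is reachable.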

Expected behavior

The audio is transcribed normally.

codingl2k1 commented 3 months ago

Is there any error message?

leslie2046 commented 3 months ago

rtf_avg: 0.007, time_speech: 62.041, time_escape: 0.441: 100%|█| 1/1 [00:00<00
0%| | 0/1 [00:00<?, ?it/s]
2024-08-10 21:13:17,902 xinference.api.restful_api 247759 ERROR Remote server 0.0.0.0:43083 closed
Traceback (most recent call last):
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/api/restful_api.py", line 1275, in create_transcriptions
    transcription = await model_ref.transcriptions(
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 230, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server 0.0.0.0:43083 closed
2024-08-10 21:13:18,016 xinference.core.worker 247907 WARNING Process 0.0.0.0:43083 is down.
2024-08-10 21:13:18,027 xinference.core.worker 247907 WARNING Recreating model actor SenseVoiceSmall-1-0 ...
2024-08-10 21:13:22,462 xinference.model.utils 247907 INFO Use model cache from a different hub.
2024-08-10 21:13:59,133 xinference.core.worker 247907 ERROR Failed to load model SenseVoiceSmall-1-0
Traceback (most recent call last):
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xinference/core/worker.py", line 882, in launch_builtin_model
    await model_ref.load()
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 230, in send
    result = await self._wait(future, actor_ref.address, send_message)  # type: ignore
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/context.py", line 115, in _wait
    return await future
  File "/home/njue/anaconda3/envs/xinference/lib/python3.10/site-packages/xoscar/backends/core.py", line 84, in _listen
    raise ServerClosed(
xoscar.errors.ServerClosed: Remote server unixsocket:///8518019659595776 closed

leslie2046 commented 3 months ago

It just hangs there and never moves on.

codingl2k1 commented 3 months ago

It looks like the model process crashed. That is usually caused by OOM or a CUDA error.

leslie2046 commented 3 months ago

With torch 2.0.1+cu118 and torchaudio 2.0.2+cu118 it works. Amazing.

qinxuye commented 3 months ago

The TorchAudio and torch versions need to match. Closing this for now.
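A rough way to check the pairing mentioned in the comment above is to compare the release series of the two packages. The sketch below rests on an assumption, not on an official rule: it treats two versions as "matching" when they share the same major.minor numbers, which fits the pairs reported in this thread (2.0.1 with 2.0.2, 2.3.0 with 2.3.0). The function names are hypothetical, and the official PyTorch/TorchAudio compatibility table remains the authoritative source.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_series(version: str) -> Tuple[int, int]:
    """Return (major, minor), ignoring the patch level and local tags such as '+cu118'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def same_series(torch_version: str, torchaudio_version: str) -> bool:
    """Heuristic (assumption): paired releases share the same major.minor numbers."""
    return release_series(torch_version) == release_series(torchaudio_version)


def installed_version(package: str) -> Optional[str]:
    """Read a version from package metadata without importing the package itself."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def check_installed_pair() -> Optional[bool]:
    """Return True/False for the installed pair, or None if either package is missing."""
    torch_version = installed_version("torch")
    torchaudio_version = installed_version("torchaudio")
    if torch_version is None or torchaudio_version is None:
        return None
    return same_series(torch_version, torchaudio_version)
```

Running `check_installed_pair()` inside the environment that serves the model gives a quick first answer. It only compares version numbers and says nothing about the CUDA build. The package list at the top of this issue shows torch 2.4.0 together with torchaudio 2.4.0, a pair this check would accept, so passing it does not rule out other causes of the crash.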

leslie2046 commented 3 months ago

@qinxuye But with torch 2.0.1, the requirements of other models such as whisper and bge-m3 are no longer met, because they require a newer version.

leslie2046 commented 2 months ago

For now I run SenseVoice in its own worker with torch 2.0.1+cu118 and torchaudio 2.0.2+cu118.

CosyVoice runs in a separate worker with torch 2.3.0+cu118 and torchaudio 2.3.0+cu118.