Closed: leslie2046 closed this issue 1 month ago.
Is this using concurrent streaming calls?
Yes.
Same bug here: "Parallel generation is not supported by llama-cpp-python". CosyVoice itself already supports streaming generation, but Xinference's TTS does not support streaming yet.
Xinference's CosyVoice already supports streaming. Concurrent streaming is likely what triggers this problem. I'm not sure whether CosyVoice is thread-safe.
CosyVoice itself should be thread-safe.
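If CosyVoice itself is thread-safe, then the guard raising this error lives in Xinference's `_call_wrapper` (see the traceback in the Reproduction section below). Until that is relaxed, one caller-side workaround is to serialize speech requests. A minimal sketch, assuming the Python client's `speech` API; the lock, the helper name `speak_serialized`, the model UID, and the voice are illustrative assumptions:

```python
# Client-side workaround sketch (not part of Xinference): serialize speech
# calls so only one generation is in flight for the model at a time.
import threading

from xinference.client import Client

_tts_lock = threading.Lock()  # hypothetical module-level lock

def speak_serialized(model, text: str) -> bytes:
    # Holding the lock guarantees the worker never sees two concurrent
    # speech requests, which is what triggers the error below.
    with _tts_lock:
        # "中文女" is one of CosyVoice-300M-SFT's built-in speakers.
        return model.speech(text, voice="中文女")

client = Client("http://127.0.0.1:9997")        # supervisor endpoint from the report
model = client.get_model("CosyVoice-300M-SFT")  # model UID is an assumption
audio = speak_serialized(model, "你好")
```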
This issue is stale because it has been open for 7 days with no activity.
This issue was closed because it has been inactive for 5 days since being marked as stale.
System Info / 系統信息
CUDA: 12.2
Python: 3.10.14
OS: CentOS 7.9
Running Xinference with Docker? / 是否使用 Docker 运行 Xinference?
Version info / 版本信息
0.15.0
The command used to start Xinference / 用以启动 xinference 的命令
```
nohup xinference-supervisor -H 0.0.0.0 --log-level DEBUG > supervisor.log 2>&1 &
nohup xinference-worker -e "http://127.0.0.1:9997/" -H 192.168.1.88 --log-level DEBUG > worker.log 2>&1 &
```
Reproduction / 复现过程
```
2024-09-11 15:14:34,206 xinference.model.audio.cosyvoice 186418 INFO CosyVoice inference_sft
2024-09-11 15:14:34,207 xinference.core.model 186418 ERROR [request 7f290426-700d-11ef-bf2a-20040ff32e74] Leave speech, error: Parallel generation is not supported by llama-cpp-python., elapsed time: 0 s
Traceback (most recent call last):
  File "/home/njue/anaconda3/envs/cosyvoice/lib/python3.10/site-packages/xinference/core/utils.py", line 69, in wrapped
    ret = await func(*args, **kwargs)
  File "/home/njue/anaconda3/envs/cosyvoice/lib/python3.10/site-packages/xinference/core/model.py", line 711, in speech
    return await self._call_wrapper_binary(
  File "/home/njue/anaconda3/envs/cosyvoice/lib/python3.10/site-packages/xinference/core/model.py", line 410, in _call_wrapper_binary
    return await self._call_wrapper("binary", fn, *args, **kwargs)
  File "/home/njue/anaconda3/envs/cosyvoice/lib/python3.10/site-packages/xinference/core/model.py", line 120, in _async_wrapper
    return await fn(*args, **kwargs)
  File "/home/njue/anaconda3/envs/cosyvoice/lib/python3.10/site-packages/xinference/core/model.py", line 427, in _call_wrapper
    raise Exception("Parallel generation is not supported by llama-cpp-python.")
Exception: Parallel generation is not supported by llama-cpp-python.
2024-09-11 15:14:34,208 xinference.core.model 186418 DEBUG After request speech, current serve request count: 0 for the model CosyVoice-300M-SFT-1-0
```
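For reference, a minimal script that reproduces the error by firing two streaming speech requests at once. This is a sketch, assuming Xinference's OpenAI-compatible `/v1/audio/speech` endpoint and its `stream` flag; the host, model UID, and voice are assumptions to adjust for your deployment:

```python
# Repro sketch: two concurrent streaming speech requests against the same
# CosyVoice model; the second one hits "Parallel generation is not
# supported by llama-cpp-python."
import threading
import requests

BASE_URL = "http://127.0.0.1:9997"  # assumed supervisor endpoint

def speak(text: str) -> None:
    resp = requests.post(
        f"{BASE_URL}/v1/audio/speech",
        json={
            "model": "CosyVoice-300M-SFT",  # model UID from `xinference list`
            "input": text,
            "voice": "中文女",
            "stream": True,
        },
        stream=True,
    )
    resp.raise_for_status()
    for _chunk in resp.iter_content(chunk_size=4096):
        pass  # consume the audio stream

threads = [threading.Thread(target=speak, args=(f"测试文本 {i}",)) for i in range(2)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```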
Expected behavior / 期待表现
Requests should be handled in parallel.
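Launching the model with more than one replica is the usual way to get parallel serving in Xinference, since each replica runs its own model instance. A sketch, assuming the Python client's `launch_model` and its `replica` parameter; the model name and replica count are illustrative:

```python
# Sketch: launch two replicas so concurrent requests can be dispatched to
# different model instances instead of queueing on one.
from xinference.client import Client

client = Client("http://127.0.0.1:9997")
model_uid = client.launch_model(
    model_name="CosyVoice-300M-SFT",
    model_type="audio",
    replica=2,  # one model instance per replica; requests are load-balanced
)
print(model_uid)
```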