FunAudioLLM / CosyVoice

Multilingual large voice generation model with full-stack support for inference, training, and deployment.
https://funaudiollm.github.io/
Apache License 2.0
5.12k stars 519 forks

RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' #445

Open willmyc1 opened 16 hours ago

willmyc1 commented 16 hours ago

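Problem: when running via webui.py with the inference mode set to "pretrained voice" and clicking "generate audio", an error occurs. The server shows: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'. The full error output is as follows: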

2024-09-27 16:32:01,942 INFO get sft inference request
tn 我是通义实验室语音团队全新推出的生成式语音大模型,提供舒适自然的语音合成能力。 to 我是通义实验室语音团队全新推出的生成式语音大模型,提供舒适自然的语音合成能力。
  0%| | 0/1 [00:00<?, ?it/s]2024-09-27 16:32:01,983 INFO synthesis text 我是通义实验室语音团队全新推出的生成式语音大模型,提供舒适自然的语音合成能力。
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/root/CosyVoice/cosyvoice/cli/model.py", line 84, in llm_job
    for i in self.llm.inference(text=text.to(self.device),
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/root/CosyVoice/cosyvoice/llm/llm.py", line 172, in inference
    text, text_len = self.encode(text, text_len)
  File "/root/CosyVoice/cosyvoice/llm/llm.py", line 75, in encode
    encoder_out, encoder_mask = self.text_encoder(text, text_lengths, decoding_chunk_size=1, num_decoding_left_chunks=-1)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/cosyvoice/transformer/encoder/___torch_mangle_5.py", line 22, in forward
    masks = torch.bitwise_not(torch.unsqueeze(mask, 1))
    embed = self.embed
    _0 = torch.add(torch.matmul(xs, CONSTANTS.c0), CONSTANTS.c1)
    input = torch.layer_norm(_0, [1024], CONSTANTS.c2, CONSTANTS.c3)
    pos_enc = embed.pos_enc

Traceback of TorchScript, original code (most recent call last):
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

^CKeyboard interruption in main thread... closing server.
^CTraceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 2653, in block_thread
    time.sleep(0.1)
KeyboardInterrupt

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "webui.py", line 188, in <module>
    main()
  File "webui.py", line 171, in main
    demo.launch(server_name='0.0.0.0', server_port=args.port)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 2558, in launch
    self.block_thread()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 2657, in block_thread
    self.server.close()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/http_server.py", line 68, in close
    self.thread.join(timeout=5)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 1015, in join
    self._wait_for_tstate_lock(timeout=max(timeout, 0))
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 1027, in _wait_for_tstate_lock
    elif lock.acquire(block, timeout):
KeyboardInterrupt
czydfj commented 15 hours ago

I also have the same problem:

%%%%%%%%%%%%%%%%%%%%%%%%%%%%
2024-09-27 17:25:03,941 INFO synthesis text 我是通义实验室语音团队全新推出的生成式语音大模型,提供舒适自然的语音合成能力、
Exception in thread Thread-7:
Traceback (most recent call last):
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/data/home/ghostchen/CosyVoice/cosyvoice/cli/model.py", line 84, in llm_job
    for i in self.llm.inference(text=text.to(self.device),
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/data/home/ghostchen/CosyVoice/cosyvoice/llm/llm.py", line 172, in inference
    text, text_len = self.encode(text, text_len)
  File "/data/home/ghostchen/CosyVoice/cosyvoice/llm/llm.py", line 75, in encode
    encoder_out, encoder_mask = self.text_encoder(text, text_lengths, decoding_chunk_size=1, num_decoding_left_chunks=-1)
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/__torch__/cosyvoice/transformer/encoder/___torch_mangle_5.py", line 22, in forward
    masks = torch.bitwise_not(torch.unsqueeze(mask, 1))
    embed = self.embed
    _0 = torch.add(torch.matmul(xs, CONSTANTS.c0), CONSTANTS.c1)
    input = torch.layer_norm(_0, [1024], CONSTANTS.c2, CONSTANTS.c3)
    pos_enc = embed.pos_enc

Traceback of TorchScript, original code (most recent call last):
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'

  0%|                                                                                                              | 0/1 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/queueing.py", line 521, in process_events
    response = await route_utils.call_process_api(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1945, in process_api
    result = await self.call_function(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1525, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 655, in async_iteration
    return await iterator.__anext__()
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 648, in __anext__
    return await anyio.to_thread.run_sync(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 2357, in run_sync_in_worker_thread
    return await future
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 864, in run
    result = context.run(func, *args)
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 631, in run_sync_iterator_async
    return next(iterator)
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 814, in gen_wrapper
    response = next(iterator)
  File "webui.py", line 114, in generate_audio
    for i in cosyvoice.inference_sft(tts_text, sft_dropdown, stream=stream, speed=speed):
  File "/data/home/ghostchen/CosyVoice/cosyvoice/cli/cosyvoice.py", line 61, in inference_sft
    for model_output in self.model.tts(**model_input, stream=stream, speed=speed):
  File "/data/home/ghostchen/CosyVoice/cosyvoice/cli/model.py", line 180, in tts
    this_tts_speech = self.token2wav(token=this_tts_speech_token,
  File "/data/home/ghostchen/CosyVoice/cosyvoice/cli/model.py", line 98, in token2wav
    tts_mel = self.flow.inference(token=token.to(self.device),
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/home/ghostchen/CosyVoice/cosyvoice/flow/flow.py", line 122, in inference
    token = self.input_embedding(torch.clamp(token, min=0)) * mask
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/data/home/ghostchen/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)
RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'
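
The message "addmm_impl_cpu_" not implemented for 'Half' generally means that a float16 matrix multiply was dispatched to the CPU on a PyTorch build that lacks that kernel. That would fit both logs above if the model is running without CUDA. The sketch below is not CosyVoice code: `safe_dtype_name` and `cast_for_device` are hypothetical helpers that illustrate one common workaround, falling back to float32 whenever the target device is the CPU.

```python
def safe_dtype_name(device: str, requested: str = "float16") -> str:
    """Pick a dtype name that is safe to use on `device`.

    Assumption: half-precision matmul kernels may be missing on CPU
    (as the "addmm_impl_cpu_" error suggests), so CPU falls back to float32.
    """
    on_cpu = device.split(":")[0] == "cpu"
    if on_cpu and requested in ("float16", "half"):
        return "float32"
    return requested


def cast_for_device(module, device: str, requested: str = "float16"):
    """Move a torch.nn.Module to `device`, using the dtype chosen above.

    Hypothetical helper for illustration; not part of CosyVoice.
    """
    import torch  # imported lazily so safe_dtype_name works without PyTorch

    dtype = getattr(torch, safe_dtype_name(device, requested))
    return module.to(device=device, dtype=dtype)
```

For example, `cast_for_device(model, "cpu")` would return the module in float32, while a `"cuda:0"` target keeps float16. A cast like this may not be enough for a frozen TorchScript archive: the `CONSTANTS.c0` operands in the traceback suggest the weights were serialized as half-precision constants, in which case loading a float32 export or running on a GPU may be needed instead. The later "Expected tensor for argument #1 'indices'" error in the second log may be a downstream effect of the failed LLM thread rather than a separate bug.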