FunAudioLLM / CosyVoice

A multilingual large voice generation model providing full-stack inference, training, and deployment capabilities.
https://funaudiollm.github.io/
Apache License 2.0

Running on a Linux server, two errors: RuntimeError: "addmm_impl_cpu_" not implemented for 'Half' and RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding) #485

Open jacksonzjh opened 1 month ago

jacksonzjh commented 1 month ago

Deployed on a Linux server, with the dependencies in requirements.txt unmodified. After launching, voice cloning fails with the following error:

Traceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/cli/model.py", line 84, in llm_job
    for i in self.llm.inference(text=text.to(self.device),
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 35, in generator_context
    response = gen.send(None)
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/llm/llm.py", line 172, in inference
    text, text_len = self.encode(text, text_len)
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/llm/llm.py", line 75, in encode
    encoder_out, encoder_mask = self.text_encoder(text, text_lengths, decoding_chunk_size=1, num_decoding_left_chunks=-1)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
RuntimeError: The following operation failed in the TorchScript interpreter.
Traceback of TorchScript, serialized code (most recent call last):
  File "code/torch/cosyvoice/transformer/encoder/___torch_mangle_5.py", line 22, in forward
    masks = torch.bitwise_not(torch.unsqueeze(mask, 1))
    embed = self.embed
    _0 = torch.add(torch.matmul(xs, CONSTANTS.c0), CONSTANTS.c1)
    input = torch.layer_norm(_0, [1024], CONSTANTS.c2, CONSTANTS.c3)
    pos_enc = embed.pos_enc

Traceback of TorchScript, original code (most recent call last):
**RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'**

  0%|                                                                                                                                     | 0/1 [00:28<?, ?it/s]
Traceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/queueing.py", line 521, in process_events
    response = await route_utils.call_process_api(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1945, in process_api
    result = await self.call_function(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1525, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 655, in async_iteration
    return await iterator.__anext__()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 648, in __anext__
    return await anyio.to_thread.run_sync(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 2357, in run_sync_in_worker_thread
    return await future
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 864, in run
    result = context.run(func, *args)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 631, in run_sync_iterator_async
    return next(iterator)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 814, in gen_wrapper
    response = next(iterator)
  File "webui.py", line 120, in generate_audio
    for i in cosyvoice.inference_zero_shot(tts_text, prompt_text, prompt_speech_16k, stream=stream, speed=speed):
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/cli/cosyvoice.py", line 73, in inference_zero_shot
    for model_output in self.model.tts(**model_input, stream=stream, speed=speed):
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/cli/model.py", line 177, in tts
    this_tts_speech = self.token2wav(token=this_tts_speech_token,
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/cli/model.py", line 95, in token2wav
    tts_mel = self.flow.inference(token=token.to(self.device),
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/usr/bin/MetaMonAIoT/TTS/CosyVoice/cosyvoice/flow/flow.py", line 122, in inference
    token = self.input_embedding(torch.clamp(token, min=0)) * mask
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/modules/sparse.py", line 162, in forward
    return F.embedding(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/torch/nn/functional.py", line 2210, in embedding
    return torch.embedding(weight, input, padding_idx, scale_grad_by_freq, sparse)
**RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)**

The two main problems:
**1. RuntimeError: "addmm_impl_cpu_" not implemented for 'Half'**
**2. RuntimeError: Expected tensor for argument #1 'indices' to have one of the following scalar types: Long, Int; but got torch.FloatTensor instead (while checking arguments for embedding)**

The GPU on the Linux server has not been virtualized yet. The card is an Nvidia GTX 1060, and the machine has 8 vCPUs, 16 GB of RAM, and a 200 GB mounted disk.

I urgently need to know how to fix these two problems, thanks 🙏
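For context, both errors can be reproduced in isolation. Below is a minimal sketch, assuming a recent PyTorch build; whether half-precision matmul raises on CPU varies by torch version, so the second part hedges with a try/except.

```python
import torch

# Error 2: F.embedding requires Long/Int indices. In flow.py the token
# tensor arrives as float, so torch.clamp(token, min=0) stays float and
# the embedding lookup fails; casting with .long() fixes it.
emb = torch.nn.Embedding(num_embeddings=10, embedding_dim=4)
float_idx = torch.tensor([1.0, 3.0])             # FloatTensor: would raise
long_idx = torch.clamp(float_idx, min=0).long()  # int64 indices: OK
out = emb(long_idx)
assert out.shape == (2, 4)

# Error 1: some CPU builds of torch lack half-precision GEMM kernels,
# so fp16 weights must be kept in (or cast back to) float32 on CPU.
w = torch.randn(4, 4, dtype=torch.float16)
x = torch.randn(1, 4, dtype=torch.float16)
try:
    y = x @ w  # may raise: "addmm_impl_cpu_" not implemented for 'Half'
except RuntimeError:
    y = x.float() @ w.float()  # the float32 path always works on CPU
```

In short: on a CPU-only (or CUDA-less) setup, run the model in float32 and make sure embedding indices are integer tensors.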
aluminumbox commented 1 month ago

Update the code and try fp16=False.

lin-gooo commented 1 month ago

> Update the code and try fp16=False.

With fp16=False, the assertion fails:

Traceback (most recent call last):
  File "server.py", line 90, in <module>
    main()
  File "server.py", line 70, in main
    cosyvoice_pb2_grpc.add_CosyVoiceServicer_to_server(CosyVoiceServiceImpl(args), grpcServer)
  File "server.py", line 36, in __init__
    self.cosyvoice = CosyVoice(args.model_dir, fp16=False)
  File "/workspace/CosyVoice/cosyvoice/cli/cosyvoice.py", line 45, in __init__
    self.model.load_jit('{}/llm.text_encoder.fp16.zip'.format(model_dir),
  File "/workspace/CosyVoice/cosyvoice/cli/model.py", line 72, in load_jit
    assert self.fp16 is True, "we only provide fp16 jit model, set fp16=True if you want to use jit model"
AssertionError: we only provide fp16 jit model, set fp16=True if you want to use jit model
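The assertion comes from the JIT-loading path: the repository only ships an fp16 TorchScript text encoder, so loading the JIT model with fp16=False can never succeed. A pure-Python sketch of that guard, with `ModelSketch` as a hypothetical stand-in for the real class in cosyvoice/cli/model.py:

```python
# Hypothetical stand-in reproducing the guard seen in the traceback above.
class ModelSketch:
    def __init__(self, fp16: bool):
        self.fp16 = fp16

    def load_jit(self, path: str):
        # the shipped llm.text_encoder.fp16.zip checkpoint is fp16-only
        assert self.fp16 is True, \
            "we only provide fp16 jit model, set fp16=True if you want to use jit model"

m = ModelSketch(fp16=False)
try:
    m.load_jit("llm.text_encoder.fp16.zip")
    tripped = False
except AssertionError:
    tripped = True
assert tripped  # fp16=False combined with JIT loading always trips the assertion
```

So fp16=False alone is not enough; JIT loading must also be disabled (load_jit=False), which is what the fix below does.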

teamos-hub commented 3 weeks ago

Hit the same problem. Just edit cosyvoice/cli/cosyvoice.py:

def __init__(self, model_dir, load_jit=False, load_onnx=True, fp16=False):
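With those defaults, constructing the model on CPU never takes the fp16-only JIT path. A hedged sketch of the resulting control flow; `CosyVoiceSketch` is illustrative only, and the load_onnx branch is omitted:

```python
# Illustrative sketch of the changed defaults: load_jit=False and fp16=False
# mean the fp16-only JIT checkpoint is only loaded when explicitly requested.
class CosyVoiceSketch:
    def __init__(self, model_dir, load_jit=False, load_onnx=True, fp16=False):
        self.model_dir = model_dir
        self.fp16 = fp16
        self.jit_loaded = False
        if load_jit:
            # only reached on explicit request; the JIT checkpoint needs fp16
            assert fp16, "we only provide fp16 jit model"
            self.jit_loaded = True

cv = CosyVoiceSketch('pretrained_models/CosyVoice-300M')
assert cv.jit_loaded is False and cv.fp16 is False  # safe CPU defaults
```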

teamos-hub commented 3 weeks ago

Author: jack <jack@teamos.io>
Date:   Mon Nov 4 03:15:25 2024 +0000

    fix cpu run

diff --git a/cosyvoice/cli/cosyvoice.py b/cosyvoice/cli/cosyvoice.py
index 48babf3..897ef60 100644
--- a/cosyvoice/cli/cosyvoice.py
+++ b/cosyvoice/cli/cosyvoice.py
@@ -23,7 +23,7 @@ from cosyvoice.utils.file_utils import logging

 class CosyVoice:

zhuk commented 2 weeks ago

Could it be that the GPU is not enabled?
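A quick way to confirm: if PyTorch cannot see the GPU, everything silently runs on the CPU, which is exactly where the fp16 kernels are missing. A minimal check:

```python
import torch

# Prints False when CUDA drivers are missing or the torch build is CPU-only;
# in that case the model falls back to CPU and fp16 should be disabled.
print("CUDA available:", torch.cuda.is_available())
print("Device count:", torch.cuda.device_count())
```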