FunAudioLLM / CosyVoice

Multi-lingual large voice generation model, providing inference, training and deployment full-stack ability.
https://funaudiollm.github.io/
Apache License 2.0

Fail to run branch 'inference_streaming' #268

Open sunyanqing opened 3 months ago

sunyanqing commented 3 months ago

**Describe the bug**
The `inference_streaming` branch fails to run.

**To Reproduce**
Steps to reproduce the behavior:

  1. git checkout inference_streaming
  2. python webui.py --port 50000 --model_dir pretrained_models/CosyVoice-300M

**Expected behavior**
The web UI should start and run without errors.

**Screenshots**

```
Traceback (most recent call last):
  File "webui.py", line 179, in <module>
    cosyvoice = CosyVoice(args.model_dir)
  File "/root/CosyVoice/cosyvoice/cli/cosyvoice.py", line 30, in __init__
    configs = load_hyperpyyaml(f)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/hyperpyyaml/core.py", line 188, in load_hyperpyyaml
    hparams = yaml.load(yaml_stream, Loader=loader)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 116, in get_single_data
    return self.construct_document(node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 120, in construct_document
    data = self.construct_object(node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 147, in construct_object
    data = self.construct_non_recursive_object(node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 188, in construct_non_recursive_object
    for _dummy in generator:
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 633, in construct_yaml_map
    value = self.construct_mapping(node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 429, in construct_mapping
    return BaseConstructor.construct_mapping(self, node, deep=deep)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 244, in construct_mapping
    value = self.construct_object(value_node, deep=deep)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 147, in construct_object
    data = self.construct_non_recursive_object(node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/ruamel/yaml/constructor.py", line 183, in construct_non_recursive_object
    data = constructor(self, tag_suffix, node)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/hyperpyyaml/core.py", line 481, in _construct_object
    return callable(*args, **kwargs)
TypeError: ('Invalid argument to class cosyvoice.llm.llm.TransformerLM', "__init__() missing 1 required positional argument: 'sampling'")
```
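The `TypeError` at the bottom is the key line: the `cosyvoice.yaml` shipped inside the downloaded model directory predates this branch, so it instantiates `cosyvoice.llm.llm.TransformerLM` without the `sampling` argument that the branch's class now requires. A minimal, self-contained sketch of that failure mode (only the class name and the `sampling` parameter come from the traceback; the other argument names and values are made up for illustration):

```python
# Sketch of a config/class mismatch: a constructor call driven by a stale
# config that omits a newly required keyword argument. Only 'TransformerLM'
# and 'sampling' come from the traceback; the rest is illustrative.
class TransformerLM:
    def __init__(self, llm_input_size, llm_output_size, sampling):
        self.sampling = sampling

# An outdated YAML yields only the old constructor arguments:
stale_config = {"llm_input_size": 1024, "llm_output_size": 1024}

try:
    TransformerLM(**stale_config)  # old config, new class signature
except TypeError as err:
    print(err)  # message ends with: missing 1 required positional argument: 'sampling'
```

The fix is therefore to replace the stale config with one whose arguments match the current class signature.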


suansuancwk commented 3 months ago

Overwrite the `cosyvoice.yaml` in the model directory with the `cosyvoice.yaml` from `examples/libritts/cosyvoice/conf`.
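A sketch of that overwrite using placeholder files in a scratch directory; in a real checkout the source is `examples/libritts/cosyvoice/conf/cosyvoice.yaml` and the destination is your model directory, e.g. `pretrained_models/CosyVoice-300M/cosyvoice.yaml` (the backup step is an addition, not part of the original suggestion):

```shell
# Demonstrates the suggested fix on placeholder files; substitute your real
# CosyVoice checkout and model directory for the scratch paths below.
set -eu
repo=$(mktemp -d)
mkdir -p "$repo/examples/libritts/cosyvoice/conf" "$repo/pretrained_models/CosyVoice-300M"
echo "streaming-branch config" > "$repo/examples/libritts/cosyvoice/conf/cosyvoice.yaml"
echo "stale model config"      > "$repo/pretrained_models/CosyVoice-300M/cosyvoice.yaml"

# Back up the model's yaml, then overwrite it with the branch's example config.
cp "$repo/pretrained_models/CosyVoice-300M/cosyvoice.yaml" \
   "$repo/pretrained_models/CosyVoice-300M/cosyvoice.yaml.bak"
cp "$repo/examples/libritts/cosyvoice/conf/cosyvoice.yaml" \
   "$repo/pretrained_models/CosyVoice-300M/cosyvoice.yaml"
cat "$repo/pretrained_models/CosyVoice-300M/cosyvoice.yaml"  # → streaming-branch config
```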

sunyanqing commented 3 months ago

> Overwrite the `cosyvoice.yaml` in the model directory with the `cosyvoice.yaml` from `examples/libritts/cosyvoice/conf`.

Thank you, that resolved the initial issue. However, when synthesizing the first batch of speech, we hit another problem:

```
Traceback (most recent call last):
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/queueing.py", line 560, in process_events
    response = await route_utils.call_process_api(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/route_utils.py", line 276, in call_process_api
    output = await app.get_blocks().process_api(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1945, in process_api
    result = await self.call_function(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/blocks.py", line 1525, in call_function
    prediction = await utils.async_iteration(iterator)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 655, in async_iteration
    return await iterator.__anext__()
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 648, in __anext__
    return await anyio.to_thread.run_sync(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 631, in run_sync_iterator_async
    return next(iterator)
  File "/root/miniconda3/envs/cosyvoice/lib/python3.8/site-packages/gradio/utils.py", line 814, in gen_wrapper
    response = next(iterator)
  File "webui.py", line 115, in generate_audio
    for i in cosyvoice.inference_sft(tts_text, sft_dropdown, stream=stream):
  File "/root/CosyVoice/cosyvoice/cli/cosyvoice.py", line 53, in inference_sft
    for model_output in self.model.inference(*model_input, stream=stream):
  File "/root/CosyVoice/cosyvoice/cli/model.py", line 109, in inference
    this_tts_speech = fade_in_out(this_tts_speech, cache_speech, self.window)
  File "/root/CosyVoice/cosyvoice/utils/common.py", line 136, in fade_in_out
    fade_in_speech[:, :speech_overlap_len] = fade_in_speech[:, :speech_overlap_len] * window[:speech_overlap_len] + fade_out_speech[:, -speech_overlap_len:] * window[speech_overlap_len:]
RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details.
```

c2j commented 3 months ago

> Overwrite the `cosyvoice.yaml` in the model directory with the `cosyvoice.yaml` from `examples/libritts/cosyvoice/conf`.

Tried this and it works in Python 3.11 with torch 2.0.1+cu118.

github-actions[bot] commented 2 months ago

This issue is stale because it has been open for 30 days with no activity.

xianqiliu commented 2 months ago

For `RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed. You can make a clone to get a normal tensor before doing inplace update. See https://github.com/pytorch/rfcs/pull/17 for more details.`, modify `fade_in_out` in `cosyvoice/utils/common.py` as follows:

```python
# cosyvoice/utils/common.py
def fade_in_out(fade_in_mel, fade_out_mel, window):
    device = fade_in_mel.device
    fade_in_mel, fade_out_mel = fade_in_mel.cpu(), fade_out_mel.cpu()
    mel_overlap_len = int(window.shape[0] / 2)
    fade_in_mel = fade_in_mel.clone()  # clone the inference tensor before the in-place update; works for me
    fade_in_mel[..., :mel_overlap_len] = fade_in_mel[..., :mel_overlap_len] * window[:mel_overlap_len] + \
        fade_out_mel[..., -mel_overlap_len:] * window[mel_overlap_len:]
    return fade_in_mel.to(device)
```
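For context, the cross-fade this function performs is a windowed overlap-add: the head of the incoming chunk is blended with the tail of the cached one, weighted by the two halves of the window. A dependency-free sketch with plain Python lists (the window values and overlap length are illustrative, not CosyVoice's actual ones), copying the buffer first just as the `.clone()` fix does:

```python
def fade_in_out_lists(fade_in, fade_out, window):
    """Cross-fade: blend the tail of `fade_out` into the head of `fade_in`.

    `window` holds fade-in weights followed by fade-out weights, so the
    overlap length is half the window length (mirroring common.py).
    """
    overlap = len(window) // 2
    out = list(fade_in)  # copy instead of updating in place (cf. .clone())
    for i in range(overlap):
        out[i] = fade_in[i] * window[i] + fade_out[-overlap + i] * window[overlap + i]
    return out

# A complementary linear window keeps the total gain near 1 in the overlap.
window = [0.0, 0.5, 1.0, 1.0, 0.5, 0.0]  # fade-in half, then fade-out half
blended = fade_in_out_lists([0.0, 0.0, 0.0, 4.0], [2.0, 2.0, 2.0, 2.0], window)
print(blended)  # → [2.0, 1.0, 0.0, 4.0]
```

The copy-before-write is exactly why the `.clone()` call above resolves the error: slice assignment into an inference-mode tensor is an in-place update, while cloning first produces a normal tensor that may be mutated.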