shivammehta25 / Matcha-TTS

[ICASSP 2024] 🍵 Matcha-TTS: A fast TTS architecture with conditional flow matching
https://shivammehta25.github.io/Matcha-TTS/
MIT License

Matcha-TTS app error: symbol_id = _symbol_to_id[symbol] KeyError: '(' #89

Open xddun opened 3 months ago

xddun commented 3 months ago

I am running the project from https://huggingface.co/spaces/shivammehta25/Matcha-TTS/tree/main

But it fails on startup:

python app.py

[+] Model already present at /home/xiedong/.local/share/matcha_tts/matcha_ljspeech.ckpt!
[+] Model already present at /home/xiedong/.local/share/matcha_tts/hifigan_T2_v1!
[+] Model already present at /home/xiedong/.local/share/matcha_tts/matcha_vctk.ckpt!
[+] Model already present at /home/xiedong/.local/share/matcha_tts/hifigan_univ_v1!
[+] GPU Available! Using GPU
[!] Loading matcha_ljspeech!
[+] matcha_ljspeech loaded!
[!] Loading hifigan_T2_v1!
/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/torch/nn/utils/weight_norm.py:28: UserWarning: torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.
  warnings.warn("torch.nn.utils.weight_norm is deprecated in favor of torch.nn.utils.parametrizations.weight_norm.")
Removing weight norm...
[+] hifigan_T2_v1 loaded!
[!] Loading matcha_vctk!
[+] matcha_vctk loaded!
[!] Loading hifigan_univ_v1!
Removing weight norm...
[+] hifigan_univ_v1 loaded!
Caching examples at: '/ssd/xiedong/tts/Matcha-TTS/gradio_cached_examples/35'
Caching example 1/7
[1] - Input text: We propose Matcha-TTS, a new approach to non-autoregressive neural TTS, that uses conditional flow matching (similar to rectified flows) to speed up O D E-based speech synthesis.
Traceback (most recent call last):
  File "/ssd/xiedong/tts/Matcha-TTS/app.py", line 209, in <module>
    examples = gr.Examples( # pylint: disable=unused-variable
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/helpers.py", line 75, in create_examples
    client_utils.synchronize_async(examples_obj.create)
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio_client/utils.py", line 527, in synchronize_async
    return fsspec.asyn.sync(fsspec.asyn.get_loop(), func, *args, **kwargs) # type: ignore
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/fsspec/asyn.py", line 103, in sync
    raise return_result
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/fsspec/asyn.py", line 56, in _runner
    result[0] = await coro
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/helpers.py", line 277, in create
    await self.cache()
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/helpers.py", line 337, in cache
    prediction = await Context.root_block.process_api(
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/blocks.py", line 1437, in process_api
    result = await self.call_function(
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/blocks.py", line 1109, in call_function
    prediction = await anyio.to_thread.run_sync(
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/anyio/to_thread.py", line 56, in run_sync
    return await get_async_backend().run_sync_in_worker_thread(
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 2177, in run_sync_in_worker_thread
    return await future
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/anyio/_backends/_asyncio.py", line 859, in run
    result = context.run(func, *args)
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/gradio/utils.py", line 641, in wrapper
    response = f(*args, **kwargs)
  File "/ssd/xiedong/tts/Matcha-TTS/app.py", line 116, in ljspeech_example_cacher
    phones, text, text_lengths = process_text_gradio(text)
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/ssd/xiedong/tts/Matcha-TTS/app.py", line 74, in process_text_gradio
    output = process_text(1, text, device)
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/matcha/cli.py", line 51, in process_text
    intersperse(text_to_sequence(text, ["english_cleaners2"])[0], 0),
  File "/ssd/xiedong/miniconda3/envs/matcha-tts2/lib/python3.10/site-packages/matcha/text/__init__.py", line 22, in text_to_sequence
    symbol_id = _symbol_to_id[symbol]
KeyError: '('
IMPORTANT: You are using gradio version 3.43.2, however version 4.29.0 is available, please upgrade.

xddun commented 3 months ago

Why does the parenthesis in this example fail on my machine? It works fine once I delete it.

shivammehta25 commented 3 months ago

This is strange; the symbol should be ignored during lookup! But if a symbol is not in the symbols file, it won't be generated. https://github.com/shivammehta25/Matcha-TTS/blob/main/matcha/text/symbols.py
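To see which characters would trip the lookup, one option is to compare the text against the symbol set before synthesis. This is only a minimal sketch: `SYMBOLS` below is a hypothetical, much-reduced stand-in for the list in matcha/text/symbols.py, and `unsupported_symbols` is an illustrative helper, not part of Matcha-TTS. In the real pipeline the text is cleaned and phonemized first, so the check would apply to the cleaner's output rather than the raw input.

```python
# Hypothetical, reduced symbol set for illustration only; the project's real
# list is defined in matcha/text/symbols.py and is considerably larger.
SYMBOLS = "_" + ";:,.!? " + "abcdefghijklmnopqrstuvwxyz"


def unsupported_symbols(text, symbols=SYMBOLS):
    """Return the sorted, de-duplicated characters of `text` missing from `symbols`."""
    known = set(symbols)
    return sorted({ch for ch in text if ch not in known})


# Parentheses are not in the reduced set above, so they are reported.
missing = unsupported_symbols("flow matching (similar to rectified flows)")
print(missing)
```

An empty result means every character has an entry in the symbol table, so a lookup built from that table would not raise KeyError.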

xddun commented 3 months ago

Haha, indeed, I couldn't find the parentheses. Could it be an issue with the coding software I installed? I installed the required packages like this: sudo apt-get install festival espeak-ng mbrola -y.

shivammehta25 commented 3 months ago

This seems to be correct! I guess one way would be to wrap the lookup in a try/except! https://github.com/shivammehta25/Matcha-TTS/blob/d31cd92a6122fb99987715248941c96744bf0a36/matcha/text/__init__.py#L22

zhaojingxin123 commented 3 months ago

Have you tried it with a Mandarin Chinese dataset?