undertheseanlp / underthesea

Underthesea - Vietnamese NLP Toolkit
http://undertheseanlp.com
GNU General Public License v3.0

Bug report: Version 6.2.0 TTS say #672

Open MaNgTts opened 1 year ago

MaNgTts commented 1 year ago

Thank you for your amazing work. I have an issue with the new version:

say('Lớp vải bọc ghế sofa lòe loẹt sặc sỡ đến mức kể cả cô Mathilda chắc cũng phải dựng tóc gáy lên.') -> VietTTS crashed.

If the input string contains "f", "w", "j" or "z", the function crashes with the message: "f" (or w, j, z) is not in list
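A possible stopgap, sketched under assumptions: rewrite the four letters before passing the text to say(). `sanitize_for_tts` is a hypothetical helper, not part of underthesea, and the replacement spellings are rough guesses chosen for illustration only; they will not sound right for every loanword or name.

```python
# Hypothetical workaround: rewrite the letters reported to crash say().
# The replacement spellings are illustrative approximations, not a standard.
REPLACEMENTS = {"f": "ph", "w": "u", "j": "gi", "z": "d"}


def sanitize_for_tts(text: str) -> str:
    """Replace f, w, j, z (either case) with approximate Vietnamese spellings."""
    out = []
    for ch in text:
        repl = REPLACEMENTS.get(ch.lower())
        if repl is None:
            # Character is not one of the four problem letters: keep it unchanged.
            out.append(ch)
        elif ch.isupper():
            # Preserve leading capitalisation, e.g. "Z" becomes "D".
            out.append(repl.capitalize())
        else:
            out.append(repl)
    return "".join(out)


print(sanitize_for_tts("ghế sofa"))  # ghế sopha
```

The sanitized string can then be passed to say() in place of the original text. This only hides the symptom; the lookup inside the library would still fail on any other unsupported symbol.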

Thank you.

fbukevin commented 8 months ago

I got the same issue. Here are some details from my sample.

Description

Sentence: Cô Brown không thể dịch bài này. Since w is not a letter in the Vietnamese alphabet, the conversion function say() throws an exception.

Error message

ValueError: 'w' is not in list

Traceback:
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 534, in _run_script
    exec(code, module.__dict__)
File "/Users/fbukevin/Desktop/Project/viethan/app.py", line 14, in <module>
    say(viet_text)
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/underthesea/pipeline/say/__init__.py", line 31, in say
    y = text_to_speech(text)
        ^^^^^^^^^^^^^^^^^^^^
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/underthesea/pipeline/say/__init__.py", line 19, in text_to_speech
    mel = text2mel(
          ^^^^^^^^^
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/underthesea/pipeline/say/viettts_/nat/text2mel.py", line 95, in text2mel
    tokens = text2tokens(text, lexicon_fn)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/underthesea/pipeline/say/viettts_/nat/text2mel.py", line 51, in text2tokens
    p = [phonemes.index(pp) for pp in p]
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/Users/fbukevin/Desktop/Project/viethan/lib/python3.11/site-packages/underthesea/pipeline/say/viettts_/nat/text2mel.py", line 51, in <listcomp>
    p = [phonemes.index(pp) for pp in p]
         ^^^^^^^^^^^^^^^^^^
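
The failing line in the traceback uses `list.index`, which raises ValueError when the item is absent. The sketch below reproduces that pattern in isolation. The `phonemes` list here is a made-up stand-in (the real inventory in text2mel.py is not shown in this issue), and `to_ids_tolerant` is only one assumed way a lookup could skip unknown symbols, not the library's actual fix.

```python
# Stand-in phoneme inventory; the real list in text2mel.py is not shown here.
phonemes = ["a", "b", "c", "o", "r", "n"]


def to_ids_strict(symbols):
    # Same pattern as line 51 of the traceback: raises ValueError on unknown symbols.
    return [phonemes.index(s) for s in symbols]


def to_ids_tolerant(symbols):
    # Assumed alternative: silently drop symbols missing from the inventory.
    index_of = {p: i for i, p in enumerate(phonemes)}
    return [index_of[s] for s in symbols if s in index_of]


try:
    to_ids_strict(list("brown"))
except ValueError as err:
    print(err)  # 'w' is not in list

print(to_ids_tolerant(list("brown")))  # [1, 4, 3, 5]
```

Dropping unknown symbols keeps synthesis running but loses sound, so mapping them to a nearby phoneme or raising a clearer error may be preferable.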

Reproduce

Bad case

screenshot 20231221 23 33 18@2x

Good case

screenshot 20231221 23 33 34@2x