TensorSpeech / TensorFlowTTS

TensorFlowTTS: Real-Time State-of-the-art Speech Synthesis for Tensorflow 2 (supports English, French, Korean, Chinese, and German, and is easy to adapt to other languages)
https://tensorspeech.github.io/TensorFlowTTS/
Apache License 2.0

Libritts AutoProcessor fails when text contains grammar #344

Closed (OscarVanL closed 4 years ago)

OscarVanL commented 4 years ago

Hi,

I'm trying to use the Libritts AutoProcessor for inference on my FastSpeech2 Model.

processor = AutoProcessor.from_pretrained(
    pretrained_path="../../tensorflow_tts/processor/pretrained/libritts_mapper.json"
)
processor.mode = False 

I added processor.mode = False above because the text_to_sequence function does not convert the text to phonemes by default:

https://github.com/TensorSpeech/TensorFlowTTS/blob/e42595abbf21208c81e0fabaa0b1eaeaca2c4053/tensorflow_tts/processor/libritts.py#L89-L95

But when I then process a text sequence, it fails on edge cases such as punctuation:

input_text = "hello world, this is a test"
input_ids = processor.text_to_sequence(input_text)
Processing error:

```python
KeyError                                  Traceback (most recent call last)
in <module>
      1 input_text = "hello world, this is a test"
      2 
----> 3 input_ids = processor.text_to_sequence(input_text)
      4 print(input_ids)

~\Anaconda3\envs\TensorFlowTTS\lib\site-packages\tensorflow_tts\processor\libritts.py in text_to_sequence(self, text)
     93             return self.symbols_to_ids(self.clean_g2p(text.split(" ")))
     94         else:
---> 95             return self.inference_text_to_seq(text)
     96 
     97     def inference_text_to_seq(self, text: str):

~\Anaconda3\envs\TensorFlowTTS\lib\site-packages\tensorflow_tts\processor\libritts.py in inference_text_to_seq(self, text)
     96 
     97     def inference_text_to_seq(self, text: str):
---> 98         return self.symbols_to_ids(self.text_to_ph(text))
     99 
    100     def symbols_to_ids(self, symbols_list: list):

~\Anaconda3\envs\TensorFlowTTS\lib\site-packages\tensorflow_tts\processor\libritts.py in symbols_to_ids(self, symbols_list)
     99 
    100     def symbols_to_ids(self, symbols_list: list):
--> 101         return [self.symbol_to_id[s] for s in symbols_list]
    102 
    103     def text_to_ph(self, text: str):

~\Anaconda3\envs\TensorFlowTTS\lib\site-packages\tensorflow_tts\processor\libritts.py in <listcomp>(.0)
     99 
    100     def symbols_to_ids(self, symbols_list: list):
--> 101         return [self.symbol_to_id[s] for s in symbols_list]
    102 
    103     def text_to_ph(self, text: str):

KeyError: '@,'
```

This problem has previously been discussed in https://github.com/TensorSpeech/TensorFlowTTS/issues/243. One approach was to replace the punctuation with pauses. Is there any reason the included processor does not do this by itself?
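
For illustration, the "replace punctuation with pauses" approach could look roughly like the sketch below. It does not use the TensorFlowTTS API: `SYMBOL_TO_ID` is a toy table, `@SIL` is only an assumed name for the pause token, and `replace_punctuation` is a hypothetical helper. The `@` prefix is taken from the `'@,'` key in the traceback.

```python
# Toy symbol table (hypothetical ids); the real one comes from the mapper JSON.
SYMBOL_TO_ID = {"pad": 0, "@HH": 1, "@AH0": 2, "@L": 3, "@OW1": 4, "@SIL": 5, "@END": 6}

PUNCTUATION = set(",.;:!?")
SILENCE = "@SIL"  # assumed name of the pause/silence token


def replace_punctuation(symbols):
    """Swap symbols that consist only of punctuation for the silence token."""
    cleaned = []
    for s in symbols:
        body = s[1:] if s.startswith("@") else s
        if body and all(ch in PUNCTUATION for ch in body):
            cleaned.append(SILENCE)
        else:
            cleaned.append(s)
    return cleaned


def symbols_to_ids(symbols):
    """Look up ids after punctuation has been replaced, so '@,' no longer raises KeyError."""
    return [SYMBOL_TO_ID[s] for s in replace_punctuation(symbols)]


phonemes = ["@HH", "@AH0", "@L", "@OW1", "@,", "@HH", "@AH0", "@END"]
print(symbols_to_ids(phonemes))  # the '@,' entry maps to the id of '@SIL'
```

Whether a pause is the right acoustic choice for every punctuation mark depends on how the model was trained, so this is a workaround at inference time rather than a fix to the processor.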

machineko commented 4 years ago

Oh, I see. It's not an error in the processor; you just didn't add punctuation to your mapper, so it can't convert it to a pause/silence token :)

Characters such as , . ? shouldn't be treated as pauses at the phoneme level, so this is working as intended. You can just add the extra mapping yourself.

machineko commented 4 years ago

https://github.com/TensorSpeech/TensorFlowTTS/blob/e42595abbf21208c81e0fabaa0b1eaeaca2c4053/tensorflow_tts/processor/base_processor.py#L126

Here you can just add extra tokens that map punctuation to silence tokens :)
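
A minimal sketch of that idea, under stated assumptions: the mapper is treated as a plain dict with a `symbol_to_id` entry, `@SIL` is assumed to be the silence token, and `add_punctuation_aliases` is a hypothetical helper that is not part of TensorFlowTTS. Each punctuation symbol is registered under the same id as the silence token.

```python
# Toy stand-in for a loaded mapper; the "symbol_to_id" layout and ids are assumptions.
mapper = {"symbol_to_id": {"pad": 0, "@HH": 1, "@AH0": 2, "@SIL": 3, "@END": 4}}


def add_punctuation_aliases(symbol_to_id, silence="@SIL", punctuation=",.;:!?", prefix="@"):
    """Return a copy in which every prefixed punctuation symbol shares the silence id."""
    extended = dict(symbol_to_id)  # leave the original mapping untouched
    silence_id = extended[silence]
    for ch in punctuation:
        # setdefault keeps any id that was already assigned to this symbol
        extended.setdefault(prefix + ch, silence_id)
    return extended


symbol_to_id = add_punctuation_aliases(mapper["symbol_to_id"])
ids = [symbol_to_id[s] for s in ["@HH", "@AH0", "@,", "@HH", "@END"]]
print(ids)  # '@,' now resolves to the same id as '@SIL'
```

Because several symbols now share one id, the inverse id-to-symbol lookup is no longer one-to-one; decoding that id would still yield the silence token.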

OscarVanL commented 4 years ago

I understand now :) Thank you