Trimming silences

yoosif0 commented 6 years ago

I am getting good results with begeekmyfriend's fork but I am looking for further improvements.

Some audio files have silences at the end. Would that affect the accuracy negatively? Do you think I should trim those parts manually from the wav files?

I think I might not need to trim those silences: as you can see from this alignment graph, the model seems to have learned the silent part by itself. What do you think?

Update: I found out that this alignment graph is from a generated example, not a training example, so this question might be irrelevant now.

begeekmyfriend commented 6 years ago

You may adjust audio.find_endpoint for your requirements.
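For readers unfamiliar with endpoint detection, here is a minimal sketch of the general idea: scan the waveform in windows and report where the first sustained quiet stretch begins. This is an illustrative, standard-library-only helper, not the repository's actual code; the function signature, the default values of `threshold_db` and `min_silence_sec`, and the toy signal are all assumptions.

```python
import math

def find_endpoint(samples, sample_rate, threshold_db=-40.0, min_silence_sec=0.8):
    """Return the index where the first sustained silence starts.

    Hypothetical sketch: a window counts as silent when its peak absolute
    amplitude is below the threshold. Returns len(samples) if no such
    window is found.
    """
    window = max(1, int(sample_rate * min_silence_sec))
    hop = max(1, window // 4)
    threshold = 10.0 ** (threshold_db / 20.0)  # dB -> linear amplitude
    for start in range(0, len(samples) - window + 1, hop):
        peak = max(abs(s) for s in samples[start:start + window])
        if peak < threshold:
            return start
    return len(samples)

# Toy signal: 2 s of a 5 Hz tone followed by 2 s of digital silence,
# at an artificially low sample rate to keep the example small.
sample_rate = 100
tone = [0.5 * math.sin(2 * math.pi * 5 * n / sample_rate) for n in range(200)]
wav = tone + [0.0] * 200

end = find_endpoint(wav, sample_rate)
trimmed = wav[:end]
```

Lowering `min_silence_sec` or raising `threshold_db` makes the cut more aggressive, which is the kind of adjustment suggested above.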

yoosif0 commented 6 years ago

Thank you @begeekmyfriend. I updated my post to be clearer. There is no problem with synthesizing. I am just asking if removing the silences in the preprocessing part would lead to an improvement.

begeekmyfriend commented 6 years ago

You may add the librosa.effects.trim method in _process_utterance.

keithito / tacotron

Trimming silences #212