freecui opened this issue 4 years ago
(I translated your question with google translate.)
Set the parameter voc_gen_batched to False in your hparams.py. Although batched WaveRNN is much faster than the original WaveRNN, it is a trade-off: as the batch size increases (and the number of samples in each batch entry decreases), audio generation gets faster but the quality of the generated audio gets worse.
If you disable batched generation, audio generation will be very slow, but it will ultimately produce the best results.
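For reference, the relevant inference settings live together in hparams.py; the values below are illustrative defaults, not a recommendation (voc_target and voc_overlap only matter when batched generation is enabled):

```python
# Vocoder inference settings in hparams.py (illustrative values).
voc_gen_batched = False  # True = fast batched folding; False = slow, best quality
voc_target = 11_000      # samples generated per batch entry in batched mode
voc_overlap = 550        # samples crossfaded between adjacent batch entries
```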
Thank you very much. I had set voc_gen_batched = True; I will train again with voc_gen_batched = False.
You don't need to re-train your vocoder. voc_gen_batched is for inference only.
The audio quality is better when I set voc_gen_batched = False for inference, but generation time increased from 33.42 seconds to 170 seconds on this utterance. I want to do real-time TTS; can you give me some advice?
@freecui I implemented my own batched-mode WaveRNN which generates multiple unbatched audio clips at once ("unbatched" meaning a single clip is not split into multiple segments).
It's still slower than the original batched mode and consumes tons of VRAM, but it is much faster than generating clips one by one in unbatched mode.
Maybe you should try that approach.
I was focusing more on TTS than on WaveRNN, so I still don't know how to get the best result out of batched single-clip mode.
As I said, batched WaveRNN inference is a trade-off. If you want both the best result and faster generation, you'd better implement a feature that generates multiple unbatched clips at once.
If you care more about generation time than quality, find hp.voc_target and hp.voc_overlap values that balance generation time and quality.
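A minimal sketch of the "multiple unbatched clips at once" idea described above (the function names and the hop size here are assumptions for illustration, not the repository's API): pad each utterance's mel spectrogram to a common length, stack them along a new batch axis so the vocoder runs once, then trim every output waveform back to its own un-padded duration.

```python
import numpy as np

HOP_LENGTH = 275  # assumed hop size; the real value comes from hparams.py

def stack_mels(mels):
    """Pad variable-length mel spectrograms (n_mels, T_i) to a common T and stack."""
    n_mels = mels[0].shape[0]
    max_t = max(m.shape[1] for m in mels)
    batch = np.zeros((len(mels), n_mels, max_t), dtype=np.float32)
    lengths = []
    for i, m in enumerate(mels):
        batch[i, :, : m.shape[1]] = m
        lengths.append(m.shape[1])
    return batch, lengths

def trim_outputs(wavs, lengths):
    """Cut each generated waveform back to its own un-padded duration."""
    return [w[: t * HOP_LENGTH] for w, t in zip(wavs, lengths)]
```

After generation, the padded tail of each shorter clip is discarded, so no clip is ever split into crossfaded segments; the VRAM cost grows with the batch size and the longest utterance.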
Hello! I'd like to ask you a few questions. Did you train the Chinese synthesis yourself? Where did the training data come from? Does this model support Chinese? Looking forward to your reply!
@freecui would you share your config file? Thanks a lot.
@freecui Would you please share your WaveRNN training loss?
@zhangzhenyuyu, the training data is internal data; the model does support Chinese.
@tsungruihon, we can use the default parameters.
@freecui Glad to hear that. That's a really amazing result. Would you mind sharing your WeChat so that we can communicate? I also focus on Chinese TTS and ASR. My email is petertsengruihon@gmail.com
@freecui By the way, may I ask how many epochs or steps you have trained?
@freecui May I ask: to train Chinese speech, do I need to change hparams or other files?
I saw tts_cleaner_names = ['english_cleaners'] in hparams.py, and I'm not sure whether it should be changed for Chinese.
@justln1113, yes, that needs to be changed to basic_cleaners.
@freecui Okay, thanks for the reply. Is there anything else I should pay attention to?
@freecui Hi, may I ask where I can find your training loss and step count? Thanks.
@freecui Sorry to bother you again. I ran into an error using tts_cleaner_names = ['basic_cleaners']: the inputs x it produces are all empty. I'm wondering whether I should use transliteration_cleaners instead. Looking forward to your reply, thanks!
For Chinese, basic_cleaners can work only if your input is pinyin or phoneme characters.
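The empty input x is consistent with how a Tacotron-style text frontend typically behaves: text_to_sequence keeps only characters present in its symbol table, so raw Hanzi is silently dropped while ASCII pinyin survives. A toy sketch of that filtering (the symbol list below is an invented subset for illustration, not the repository's actual table):

```python
# Toy illustration of symbol filtering in a Tacotron-style text frontend.
SYMBOLS = list("abcdefghijklmnopqrstuvwxyz12345 ")  # toy subset: ASCII + tone digits
SYMBOL_TO_ID = {s: i for i, s in enumerate(SYMBOLS)}

def text_to_sequence(text):
    """Map text to symbol ids, silently dropping anything not in the symbol table."""
    return [SYMBOL_TO_ID[c] for c in text.lower() if c in SYMBOL_TO_ID]
```

With this table, text_to_sequence("ni3 hao3") produces a non-empty id sequence, while text_to_sequence("你好") produces an empty list, which is exactly the "empty x" symptom reported above.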
@freecui Brother, does your Chinese generation use pinyin? What is the pinyin format? I've trained 900K steps following your method, but the audio generated from pinyin is still unsatisfactory.
It's pinyin characters plus tone numbers. If the audio is unsatisfactory, you can try training for more steps and then lowering the learning rate.
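A minimal sketch of the "train more steps, then lower the learning rate" advice (the base rate, decay factor, and milestone steps below are assumptions for illustration, not values from the repository):

```python
def stepped_lr(step, base_lr=1e-4, decay=0.5, milestones=(500_000, 800_000)):
    """Halve the learning rate each time training passes a milestone step."""
    lr = base_lr
    for m in milestones:
        if step >= m:
            lr *= decay
    return lr
```

The returned value would be written into the optimizer's parameter groups each step; the milestones are chosen so the rate drops only after the loss curve has clearly plateaued.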
I trained on the LJSpeech dataset; I'm not sure whether that's related.
I want to train more natural-sounding speech; right now it sounds very robotic. The result in the first comment sounds fine to me and is what I'm after.
Hello, how did your training on the LJSpeech dataset turn out? Roughly what loss value did you reach? My results are still very poor after 500K steps. I hope you can help.
I ran into the same problem as above. I sampled 5,000 utterances from the VCTK dataset and trained WaveRNN (MOL mode) from scratch with batch size = 64; after 450k steps the results are still very bad. May I sincerely ask whether there is anything I should pay attention to? Loss curve:
Audio generated at 400K steps:
Did you ever solve this? I trained on aishell3 and ran into the same problem.
Please have a listen to my result: some words or characters sound a bit shaky and hoarse, especially shaky. I don't know what the cause is. 1350.zip