begeekmyfriend / tacotron2

Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2
BSD 3-Clause "New" or "Revised" License

The voice sounds a bit shaky and hoarse #11

Closed freecui closed 4 years ago

freecui commented 4 years ago

Please listen to my result. Some words or characters sound a bit shaky and hoarse, and I don't know why. 1350.zip

begeekmyfriend commented 4 years ago

On my side the loss usually drops to about 0.1. To synthesize, you can run:

bash scripts/griffin_lim_synth.sh

To combine it with WaveRNN, please use the latest commit, https://github.com/begeekmyfriend/tacotron2/commit/48148f9feba81b4203573ad358d11a0fbb144cd5, which makes training about twice as fast.

begeekmyfriend commented 4 years ago

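Sorry, T2 has changed its padding value: https://github.com/begeekmyfriend/tacotron2/commit/e0ca654aae0fd8f9a8ad7b58af47de7602405c22 Previously the padding value was chosen for ground-truth mels, but the vocoder is trained on GTA (ground-truth-aligned) mels. The GTA mels generated by the T2 model are actually shifted in the negative direction relative to the ground truth, so some headroom has to be reserved below the signal range to keep the padding from turning into noise. The padding is now consistent with WaveRNN.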
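The padding point above can be illustrated with a minimal sketch. It is not the code from the linked commit: the helper names `pad_mel_batch` and `pad_is_below_signal`, the two-bin mel frames, and the values -4.0 and -5.0 are all hypothetical, chosen only to show why a pad value picked for ground-truth mels can end up above the quietest GTA frames.

```python
def pad_mel_batch(mels, pad_value):
    """Right-pad variable-length mel sequences to a common length.

    Each mel is a list of frames and each frame is a list of floats.
    The inputs are not modified; new lists are returned.
    """
    max_len = max(len(m) for m in mels)
    n_mels = len(mels[0][0])
    return [
        m + [[pad_value] * n_mels for _ in range(max_len - len(m))]
        for m in mels
    ]


def pad_is_below_signal(mels, pad_value):
    """Return True if pad_value is strictly lower than every mel value."""
    lowest = min(v for m in mels for frame in m for v in frame)
    return pad_value < lowest


# Hypothetical GTA mels that undershoot an assumed ground-truth floor of -4.0.
gta_mels = [
    [[-4.2, 1.0], [0.5, -3.9], [2.0, 3.1]],
    [[-4.1, 0.3]],
]

# A pad value of -4.0 would sit above the quietest GTA frames, so padded
# frames would look louder than real silence. A value with headroom does not.
batch = pad_mel_batch(gta_mels, pad_value=-5.0)
```

With these made-up numbers, `pad_is_below_signal(gta_mels, -4.0)` is false while `pad_is_below_signal(gta_mels, -5.0)` is true, which is the kind of headroom the comment describes.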

freecui commented 4 years ago

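I found that after setting the WaveRNN parameter voc_gen_batched=False, the shaky pronunciation is basically resolved. Generation then runs without batching, but that makes it take longer.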
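For context, batched vocoder generation generally means cutting the sequence into overlapping segments, generating them in parallel, and cross-fading the results. The sketch below shows only that general idea in plain Python. It is not the WaveRNN implementation: the function names, the linear fade, and the `target` and `overlap` values are assumptions, and whether segment seams are what caused the shakiness in this issue is not confirmed by the thread.

```python
def fold_with_overlap(samples, target, overlap):
    """Split samples into segments of length target + overlap.

    Consecutive segments share `overlap` samples; the last one is zero-padded.
    """
    seg_len = target + overlap
    segments = []
    start = 0
    while start + overlap < len(samples):
        seg = samples[start:start + seg_len]
        segments.append(seg + [0.0] * (seg_len - len(seg)))
        start += target
    return segments


def xfade_and_unfold(segments, target, overlap, total_len):
    """Rejoin segments with a linear cross-fade (assumes overlap <= target)."""
    out = [0.0] * (target * len(segments) + overlap)
    for i, seg in enumerate(segments):
        for j, v in enumerate(seg):
            w = 1.0
            if i > 0 and j < overlap:
                w = (j + 1) / (overlap + 1)  # fade in over the shared head
            if i < len(segments) - 1 and j >= target:
                w = (overlap - (j - target)) / (overlap + 1)  # fade out over the tail
            out[i * target + j] += w * v
    return out[:total_len]


signal = [float(i) for i in range(10)]
segments = fold_with_overlap(signal, target=4, overlap=2)

# If every segment agrees on the shared samples, the cross-fade is lossless.
same = xfade_and_unfold(segments, 4, 2, len(signal))

# If a segment is generated independently and disagrees (here: offset by 1.0),
# the seam becomes a blend of two different signals.
shifted = [segments[0], [v + 1.0 for v in segments[1]]]
blended = xfade_and_unfold(shifted, 4, 2, len(signal))
```

In this toy case `same` reproduces `signal`, while `blended` ramps between the two versions across the two overlapping samples. Unbatched generation has no such seams, at the cost of generating every sample sequentially.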

begeekmyfriend commented 4 years ago

I suggest trying again with the newly committed patch. If you are still unsure, you can also try the Tensorflow version.

begeekmyfriend commented 4 years ago

Four multi-speaker evaluation examples have been provided. Feel free to reopen this issue. t2_wavernn_eval.zip