The voice sounds a bit shaky and hoarse

begeekmyfriend / tacotron2

Forked from NVIDIA/tacotron2 and merged with Rayhane-mamah/Tacotron-2

BSD 3-Clause "New" or "Revised" License


Closed: freecui closed this issue 4 years ago

freecui commented 4 years ago

Please listen to my result. Some words or characters sound a bit shaky and hoarse, and I don't know what the cause is. 1350.zip

begeekmyfriend commented 4 years ago

On my side the loss usually drops to about 0.1. To synthesize, you can run:

bash scripts/griffin_lim_synth.sh

To combine it with WaveRNN, please use the latest commit, https://github.com/begeekmyfriend/tacotron2/commit/48148f9feba81b4203573ad358d11a0fbb144cd5, which makes training about twice as fast.

begeekmyfriend commented 4 years ago

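Sorry, T2 changed the padding value: https://github.com/begeekmyfriend/tacotron2/commit/e0ca654aae0fd8f9a8ad7b58af47de7602405c22 The previous value was chosen for ground-truth mels, whereas the vocoder is trained on GTA (ground-truth-aligned) mels. The GTA mels generated by the T2 model are actually shifted in the negative direction relative to the ground truth, so some headroom has to be reserved below the data range to keep the padding from turning into noise. The padding value is now consistent with WaveRNN.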
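The padding issue described above can be illustrated with a small sketch. Everything here is an assumption made for illustration: the function names are hypothetical, the mel values and the range of roughly -4 to 4 are invented, and the actual padding value used in the linked commit is not reproduced. The sketch only shows why a padding value that sits inside the range of GTA mel values is risky, while one below that range leaves headroom.

```python
# Illustrative only: mel values, ranges and function names are assumptions,
# not taken from the repository.

def pad_mels(mels, pad_value):
    """Right-pad variable-length mels (lists of frames) to a common length."""
    max_len = max(len(m) for m in mels)
    n_mels = len(mels[0][0])
    return [
        m + [[pad_value] * n_mels for _ in range(max_len - len(m))]
        for m in mels
    ]

def padding_is_below_content(mels, pad_value):
    """True if pad_value is strictly lower than every real mel value."""
    lowest = min(v for m in mels for frame in m for v in frame)
    return pad_value < lowest

# Hypothetical GTA mels that undershoot a ground-truth floor of -4.0.
gta_mels = [
    [[-4.3, 0.4]],
    [[-3.2, 0.9], [-4.1, -0.1]],
]

# A pad value chosen for the ground-truth floor falls inside the GTA range,
# so padded frames are not distinguishable from quiet real frames.
print(padding_is_below_content(gta_mels, -4.0))  # False

# A lower pad value leaves headroom below the GTA values.
print(padding_is_below_content(gta_mels, -5.0))  # True
padded = pad_mels(gta_mels, -5.0)
```

Whichever value is chosen, the same one would need to be used when preparing mels for the vocoder, which is the consistency with WaveRNN mentioned above.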

freecui commented 4 years ago

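I found that setting the WaveRNN parameter voc_gen_batched=False basically resolves the shaky pronunciation. Generation then no longer uses batching, but this increases the generation time.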
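For context on the trade-off reported above: batched generation in WaveRNN-style vocoders typically splits the utterance into overlapping segments, generates them in parallel, and joins them with a crossfade. The sketch below is a simplified illustration of that idea only. The function names, segment sizes and linear crossfade are assumptions, not the code used in this repository, and it is not established here that the seams are what caused the shakiness.

```python
# Simplified illustration of split-and-crossfade; names and sizes are
# hypothetical and do not mirror any particular WaveRNN implementation.

def split_with_overlap(samples, target, overlap):
    """Cut samples into segments of target + overlap, stepping by target."""
    segments = []
    start = 0
    while start < len(samples):
        end = start + target + overlap
        segments.append(samples[start:end])
        if end >= len(samples):
            break
        start += target
    return segments

def crossfade_join(segments, overlap):
    """Join segments, linearly blending each overlapping region."""
    out = list(segments[0])
    for seg in segments[1:]:
        for i in range(overlap):
            w = (i + 1) / (overlap + 1)      # weight of the new segment
            j = len(out) - overlap + i       # position inside the overlap
            out[j] = (1 - w) * out[j] + w * seg[i]
        out.extend(seg[overlap:])
    return out

signal = [float(i) for i in range(11)]
segments = split_with_overlap(signal, target=4, overlap=2)

# If every segment agrees in the overlap, joining reproduces the signal.
joined = crossfade_join(segments, overlap=2)

# Segments generated independently may disagree in the overlap; here the
# second segment is offset by 1.0, so the seam is a blend of both versions.
segments[1] = [v + 1.0 for v in segments[1]]
seam = crossfade_join(segments, overlap=2)
```

With voc_gen_batched=False the waveform is generated as one sequence, so there are no seams to blend, at the cost of losing the parallel speed-up.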

begeekmyfriend commented 4 years ago

I suggest trying again with the newly committed patch. If you are still unsure, you can also try the TensorFlow version.

begeekmyfriend commented 4 years ago

Four multi-speaker evaluation examples have been provided. Feel free to reopen this issue. t2_wavernn_eval.zip