Open JasonWei512 opened 4 years ago
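The results are poor. Try Ryuichi Yamamoto's implementation: https://github.com/JasonWei512/wavenet_vocoder First, try training directly on the GTA (ground-truth-aligned) output of Rayhane-mamah's Tacotron-2.
Precedent: https://github.com/Rayhane-mamah/Tacotron-2/issues/215 The LJSpeech model was fine-tuned on Lithuanian ground truth, with very good results.
WaveNet.zip, 300K steps: on Windows the hyperparameter JSON cannot be read, and evaluate only generates a short segment at the very beginning.
490K steps: inference speed on a 2070 Super is 30~40 it/s, roughly one-thousandth of real time.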
Was the WaveNet trained on the LJSpeech dataset?
It was trained on the Biaobei (标贝) dataset.
Could you re-upload wave_pretrained? Downloading and extracting it fails.
Download all five parts into the same directory, remove the .pptx extension from each, then extract.
Is the hparams.py in the package correct? Running synthesize.py fails with this error:
Traceback (most recent call last):
File "synthesize.py", line 100, in <module>
@nmfisher I took a look; the parameters should be correct.
Generating a wav with Tacotron is instant, but with WaveNet it takes half an hour. Why is that? The GPU is a 1080 Ti.
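The original autoregressive WaveNet is simply this slow. Every predicted sample in the generated waveform depends on the values of the 505 samples before it, so one second of speech takes 36000 sequential predictions that cannot be parallelized.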
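As a rough check on those numbers, the sketch below computes the receptive field of a dilated-convolution stack and the wall-clock cost of sample-by-sample generation. The hyperparameters (24 layers, 4 stacks, kernel size 3) are assumptions borrowed from the config posted later in this thread. The dilation pattern (doubling within each stack) is the usual WaveNet layout, not something confirmed for this checkpoint. The 35 it/s figure is just the midpoint of the 30~40 it/s reported above.

```python
def receptive_field(layers, stacks, kernel_size):
    """Receptive field (in samples) of a WaveNet-style dilated conv stack.

    Assumes dilations double within each stack (1, 2, 4, ...) and then reset.
    """
    layers_per_stack = layers // stacks
    dilations = [2 ** (i % layers_per_stack) for i in range(layers)]
    return (kernel_size - 1) * sum(dilations) + 1


def seconds_per_audio_second(sample_rate, iterations_per_second):
    """Wall-clock seconds to generate one second of audio, one sample per step."""
    return sample_rate / iterations_per_second


rf = receptive_field(layers=24, stacks=4, kernel_size=3)
cost = seconds_per_audio_second(sample_rate=36000, iterations_per_second=35)
print(rf)           # 505
print(round(cost))  # 1029
```

Under these assumed values the receptive field comes out at the 505 samples quoted above, and about 1000 seconds of compute per second of audio lines up with the "one-thousandth of real time" report.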
I implemented mixture of logistic distributions loss as well as exponential model averaging in #5. According to the Parallel WaveNet paper, exponential model averaging is important for quality.
One difference would be training time. I did finetune the model many times, i.e., train 200k steps -> (change some hyper param and let's see how it works) -> train 200k steps (lr starts from initial value) -> ... repeated. This might lead to faster convergence.
If I remember correctly, I trained the model for over 1000k steps in total.
https://github.com/r9y9/wavenet_vocoder/issues/1#issuecomment-361130247
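For readers unfamiliar with the exponential model averaging mentioned in the quote, here is a minimal sketch of the general idea: keep a slowly moving "shadow" copy of the weights and use it for evaluation. It uses plain Python lists as toy weights. The function name `ema_update` is hypothetical, and this is not taken from either repository's code.

```python
def ema_update(shadow, params, decay=0.9999):
    """Blend current parameters into their running (shadow) averages."""
    return [decay * s + (1.0 - decay) * p for s, p in zip(shadow, params)]


shadow = [0.0, 0.0]      # toy averaged weights
for step in range(3):
    params = [1.0, 2.0]  # toy "current" weights, held constant for illustration
    shadow = ema_update(shadow, params, decay=0.5)
print(shadow)  # [0.875, 1.75]
```

The toy run uses a decay of 0.5 so the effect is visible in three steps. With a decay close to 1, the shadow weights move far more slowly.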
Is the hparams.py in the package correct? Running synthesize.py fails with this error:
Traceback (most recent call last):
  File "synthesize.py", line 100, in <module>
    main()
  File "synthesize.py", line 92, in main
    wavenet_synthesize(args, hparams, wave_checkpoint)
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/synthesize.py", line 78, in wavenet_synthesize
    run_synthesis(args, checkpoint_path, output_dir, hparams)
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/synthesize.py", line 19, in run_synthesis
    synth.load(checkpoint_path, hparams)
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/synthesizer.py", line 28, in load
    self.model = create_model(model_name, hparams)
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/models/__init__.py", line 12, in create_model
    return WaveNet(hparams, init)
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/models/wavenet.py", line 192, in __init__
    up_layers=len(hparams.upsample_scales), name='SubPixelConvolutionlayer{}'.format(i))
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/models/modules.py", line 553, in __init__
    init_kernel = tf.constant_initializer(self._init_kernel(kernel_size, strides, conv_filters), dtype=tf.float32) if NN_init else None
  File "/mnt/e/projects/Tacotron-2-Chinese/wavenet_vocoder/models/modules.py", line 653, in _init_kernel
    init_kernel = np.tile(np.expand_dims(init_kernel, 3), [1, 1, 1, filters])
  File "<array_function internals>", line 6, in expand_dims
  File "/home/hydroxide/.local/lib/python3.6/site-packages/numpy/lib/shape_base.py", line 597, in expand_dims
    axis = normalize_axis_tuple(axis, out_ndim)
  File "/home/hydroxide/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 1327, in normalize_axis_tuple
    axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
  File "/home/hydroxide/.local/lib/python3.6/site-packages/numpy/core/numeric.py", line 1327, in <listcomp>
    axis = tuple([normalize_axis_index(ax, ndim, argname) for ax in axis])
numpy.AxisError: axis 3 is out of bounds for array of dimension 3
@nmfisher Did you solve this problem? I am running into the same one.
Hello, author. Could you share your trained WaveNet model? Time is tight, training takes a long time, and graduation is close. Haha, since the WaveNet model does not need to be modified.
Extract the five wave_pretrained archives from the first post.
No, author. What I mean is: do you have a trained WaveNet model, as opposed to the generated speech samples? I have trained tacotron2, but have not pretrained WaveNet yet.
Did you try it? How were the results?
@nmfisher @Hunkshang Hello, did you find a solution to the "numpy.AxisError: axis 3 is out of bounds for array of dimension 3" problem? If so, would you kindly share it? Thank you.
I got the same error message when training WaveNet. Have you solved it yet?
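Mr. Wei, reading through this thread, quite a few people have hit the same error, "numpy.AxisError: axis 3 is out of bounds for array of dimension 3". I now suspect it may be a dependency version issue. Could you share the output of conda list so we can see the version of each package? Thank you.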
@gaoyu1983 Yes, it's a numpy version problem. I updated numpy from the recommended version to the next one, i.e., from numpy == 1.14 to numpy == 1.15.
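To see whether a given NumPy install is affected, a small probe like the one below may help. It repeats the call pattern from the traceback on a toy array. It assumes the kernel reaching `np.expand_dims` was 2-D, which is what "axis 3 is out of bounds for array of dimension 3" implies. As far as I know, older NumPy releases only warned about an out-of-range axis here, while newer ones raise, but treat the exact cutoff version as something to verify. The function name is hypothetical.

```python
import numpy as np


def expand_dims_accepts_out_of_range_axis():
    """Return True if this NumPy tolerates axis=3 on a 2-D array."""
    kernel = np.zeros((3, 3), dtype=np.float32)  # toy stand-in, not the real kernel
    try:
        np.expand_dims(kernel, 3)
    except ValueError:  # numpy's AxisError is a subclass of ValueError
        return False
    return True


print(np.__version__, expand_dims_accepts_out_of_range_axis())
```

If this prints `False`, the installed NumPy rejects the call that `_init_kernel` makes, and pinning an older NumPy as described above is one way around it.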
Thank you, it really works.
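TTS discussion group, WeChat: WorldSeal. You are welcome to join and discuss related questions.
I trained WaveNet for 1000k steps and got no usable results. Could anyone give me some pointers? I did not use a pretrained model, and the training data is about 80 hours. Configuration: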
{
  "name": "wavenet_vocoder",
  "input_type": "raw",
  "quantize_channels": 65536,
  "preprocess": "preemphasis",
  "postprocess": "inv_preemphasis",
  "global_gain_scale": 0.55,
  "sample_rate": 22050,
  "silence_threshold": 2,
  "num_mels": 80,
  "fmin": 125,
  "fmax": 7600,
  "fft_size": 1024,
  "hop_size": 256,
  "frame_shift_ms": null,
  "win_length": 1024,
  "win_length_ms": -1.0,
  "window": "hann",
  "highpass_cutoff": 70.0,
  "output_distribution": "Normal",
  "log_scale_min": -16.0,
  "out_channels": 2,
  "layers": 24,
  "stacks": 4,
  "residual_channels": 128,
  "gate_channels": 256,
  "skip_out_channels": 128,
  "dropout": 0.0,
  "kernel_size": 3,
  "cin_channels": 80,
  "cin_pad": 2,
  "upsample_conditional_features": true,
  "upsample_net": "ConvInUpsampleNetwork",
  "upsample_params": {
    "upsample_scales": [4, 4, 4, 4]
  },
  "gin_channels": -1,
  "n_speakers": 7,
  "pin_memory": true,
  "num_workers": 2,
  "batch_size": 8,
  "optimizer": "Adam",
  "optimizer_params": {
    "lr": 0.001,
    "eps": 1e-08,
    "weight_decay": 0.0
  },
  "lr_schedule": "step_learning_rate_decay",
  "lr_schedule_kwargs": {
    "anneal_rate": 0.5,
    "anneal_interval": 200000
  },
  "max_train_steps": 1000000,
  "nepochs": 2000,
  "clip_thresh": -1,
  "max_time_sec": null,
  "max_time_steps": 10240,
  "exponential_moving_average": true,
  "ema_decay": 0.9999,
  "checkpoint_interval": 100000,
  "train_eval_interval": 100000,
  "test_eval_epoch_interval": 50,
  "save_optimizer_state": true
}
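Two quick sanity checks on a config like this can be sketched as follows. The first checks that the product of `upsample_scales` equals `hop_size`, so that each mel frame is upsampled to exactly one hop of audio samples. The second estimates the learning rate near the end of training. The decay formula is an assumption inferred from the parameter names `anneal_rate` and `anneal_interval`, not confirmed against the training code, and both helper names are hypothetical.

```python
def upsample_factor(scales):
    """Total upsampling factor: the product of the per-layer scales."""
    total = 1
    for s in scales:
        total *= s
    return total


def step_decay_lr(step, lr=0.001, anneal_rate=0.5, anneal_interval=200000):
    """Assumed step decay: multiply lr by anneal_rate every anneal_interval steps."""
    return lr * anneal_rate ** (step // anneal_interval)


hop_size = 256
scales = [4, 4, 4, 4]
print(upsample_factor(scales) == hop_size)  # True
print(step_decay_lr(999999))                # 6.25e-05
```

Both checks pass for the values posted here, so under these assumptions neither the upsampling factor nor the schedule is obviously misconfigured.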
wave_pretrained.z01.pptx wave_pretrained.z02.pptx wave_pretrained.z03.pptx wave_pretrained.z04.pptx wave_pretrained.zip.pptx Remove the .pptx extension from each file, then extract.
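The renaming step can be scripted. The sketch below is one way to do it with the standard library. The function name and the default glob pattern are hypothetical choices, and the directory in the usage comment is only a placeholder for wherever the five parts were downloaded.

```python
from pathlib import Path


def strip_pptx_suffix(directory, pattern="wave_pretrained.*.pptx"):
    """Rename matching files in `directory` by dropping the trailing '.pptx'."""
    renamed = []
    for path in sorted(Path(directory).glob(pattern)):
        target = path.with_suffix("")  # 'x.z01.pptx' -> 'x.z01'
        path.rename(target)
        renamed.append(target.name)
    return renamed


# Usage (placeholder path): strip_pptx_suffix("path/to/downloads")
```

After renaming, open wave_pretrained.zip with an archive tool that supports split archives, so it can pick up the .z01 to .z04 parts from the same directory.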
Griffin-Lim vs WaveNet.zip