PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0

Cantonese fine-tuning gives poor results #3285

Open mayuanyang opened 1 year ago

mayuanyang commented 1 year ago

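I tried Cantonese voice cloning with the tts_finetune workflow. The cloned voice does resemble the speaker, but the output has a loud electrical hum. The training data was recorded with my own microphone and sounds clear, yet the fine-tuned result has heavy "tremolo / electrical noise". When I instead used WAV files generated by another TTS system, in roughly the same amount, as training material, the cloned result was quite good. Is the poor result caused by a problem with my recordings?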

Fine-tuning steps

  1. Align with MFA (Montreal Forced Aligner)
  2. Use the pretrained models fastspeech2_canton_ckpt_1.4.0.zip and pwg_aishell3_ckpt_0.5.zip

Here is a sample produced after training: 170.wav.zip
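
One way to compare the microphone recordings with the TTS-generated WAVs described above is to look at a few basic signal statistics for each file. The following is a minimal sketch, not part of PaddleSpeech: the function name `recording_stats` is hypothetical, and it assumes 16-bit PCM WAV input. It does not identify the cause of the noise; it only shows whether the two data sets differ in peak level, DC offset, or clipping.

```python
import array
import sys
import wave


def recording_stats(path):
    """Return peak level, DC offset, and clipping ratio of a 16-bit PCM WAV.

    Hypothetical helper for illustration; values are normalized to [-1, 1].
    """
    with wave.open(str(path), "rb") as wf:
        if wf.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        samples = array.array("h", wf.readframes(wf.getnframes()))

    # WAV data is little-endian; swap bytes on big-endian machines.
    if sys.byteorder == "big":
        samples.byteswap()

    n = len(samples)
    if n == 0:
        raise ValueError("file contains no audio samples")

    return {
        # Largest absolute sample value, relative to full scale.
        "peak": max(abs(s) for s in samples) / 32768.0,
        # Mean sample value; a value far from 0 indicates a DC offset.
        "dc_offset": sum(samples) / n / 32768.0,
        # Fraction of samples at (or next to) full scale.
        "clipped_ratio": sum(1 for s in samples if abs(s) >= 32767) / n,
    }


if __name__ == "__main__":
    # Usage: python recording_stats.py file1.wav file2.wav ...
    for wav_path in sys.argv[1:]:
        print(wav_path, recording_stats(wav_path))
```

Running it on a few files from each data set (the microphone recordings and the TTS-generated ones) gives numbers that can be compared side by side.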

zxcd commented 1 year ago

What is the sample rate of your own microphone recordings? Is it the same as that of the pretraining data?

mayuanyang commented 1 year ago

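What is the sample rate of your own microphone recordings? Is it the same as that of the pretraining data?

I have tried 16k, 24k, and 42k, but the results were about the same.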
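
To confirm what the recordings actually contain, the format of each WAV file can be read directly from its header. Below is a minimal sketch using only Python's standard `wave` module. The expected rate of 24000 Hz and the mono 16-bit format are assumptions, not values confirmed in this thread; check them against the config file shipped with the pretrained checkpoints. The function names are hypothetical.

```python
import sys
import wave
from pathlib import Path

# Assumption: the pretrained models expect 24 kHz audio.
# Verify this against the config shipped with the checkpoints.
EXPECTED_RATE = 24000


def describe_wav(path):
    """Return (sample_rate, channels, bits_per_sample) for a PCM WAV file.

    The standard `wave` module reads PCM WAV only; other encodings
    (for example 32-bit float) raise wave.Error.
    """
    with wave.open(str(path), "rb") as wf:
        return wf.getframerate(), wf.getnchannels(), wf.getsampwidth() * 8


def find_mismatches(wav_dir, expected_rate=EXPECTED_RATE):
    """List WAV files that are not mono 16-bit at expected_rate."""
    mismatches = []
    for path in sorted(Path(wav_dir).glob("*.wav")):
        rate, channels, bits = describe_wav(path)
        if rate != expected_rate or channels != 1 or bits != 16:
            mismatches.append((path.name, rate, channels, bits))
    return mismatches


if __name__ == "__main__":
    # Usage: python check_wavs.py <directory with .wav files>
    wav_dir = sys.argv[1] if len(sys.argv) > 1 else "."
    for name, rate, channels, bits in find_mismatches(wav_dir):
        print(f"{name}: {rate} Hz, {channels} ch, {bits}-bit")
```

An empty result means every file in the directory already matches the assumed format; any listed file would need to be converted before alignment and fine-tuning.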

stale[bot] commented 9 months ago

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.