PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.
https://paddlespeech.readthedocs.io
Apache License 2.0
10.99k stars 1.83k forks

[TTS]ErnieSAT #2154

Closed by yt605155624 2 years ago

yt605155624 commented 2 years ago

Introduction:

PR:

yt605155624 commented 2 years ago

Loss on vctk (8 GPUs):

[2 images: loss curves]
yt605155624 commented 2 years ago

Loss on aishell3 (8 GPUs):

[2 images: loss curves]
yt605155624 commented 2 years ago

Loss on aishell3_vctk (8 GPUs):

[2 images: loss curves]

sixyang commented 2 years ago

In terms of how it sounds, is there any difference between ErnieSAT and fastspeech2+pwgan? Is the quality any better?

yt605155624 commented 2 years ago

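The quality should be about the same. ERNIE-SAT is a cross-modal model that handles multiple tasks with a single model, while FS2 (FastSpeech2) is a pure acoustic model.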