[TTS] Question about Style control in FastSpeech2

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

https://paddlespeech.readthedocs.io

Apache License 2.0

10.99k stars 1.83k forks

[TTS] Question about Style control in FastSpeech2 #3360

Closed laishujie closed 1 year ago

laishujie commented 1 year ago

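According to the documentation at https://paddlespeech.readthedocs.io/en/latest/tts/demo.html#style-control-in-fastspeech2, fine-grained control of speech rate is supported.

The example used in the demo is demos/style_fs2/style_syn.py

The model it uses is fastspeech2_csmsc-zh

In theory, any FastSpeech2 model should support this. However, after downloading fastspeech2_aishell3-zh and comparing it with fastspeech2_csmsc-zh, I found that the aishell3 model is missing the energy_stats.npy and pitch_stats.npy files (see the screenshot).

[Screenshot 2023-06-28 16-45-06]

Where can I obtain these two files, or how can I generate them, so that I get the same speed-change effect as in the example? Thanks.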
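For readers unfamiliar with how speed control works in FastSpeech2-style models, here is a minimal sketch. It only illustrates the general idea (predicted phoneme durations are scaled before the length regulator expands the hidden states); the function names `scale_durations` and `length_regulate` are hypothetical and are not PaddleSpeech APIs.

```python
import numpy as np

def scale_durations(durations, speed):
    """Return integer frame counts for a given speed (2.0 = twice as fast)."""
    scaled = np.round(np.asarray(durations, dtype=np.float64) / speed)
    # Keep at least one frame per phoneme so that nothing is dropped.
    return np.maximum(scaled, 1).astype(np.int64)

def length_regulate(hidden, durations):
    """Repeat each phoneme's hidden vector durations[i] times along time."""
    return np.repeat(hidden, durations, axis=0)

# Toy input: 3 phonemes with 2-dimensional hidden states.
hidden = np.arange(6, dtype=np.float32).reshape(3, 2)
durations = [4, 2, 6]  # hypothetical predicted frames per phoneme

frames_normal = length_regulate(hidden, scale_durations(durations, 1.0))
frames_fast = length_regulate(hidden, scale_durations(durations, 2.0))
```

Under these assumptions, doubling the speed halves the number of output frames, which is why the synthesized audio becomes shorter.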

zh794390558 commented 1 year ago

If they are missing, you can regenerate them from the dataset; see the data processing scripts in the corresponding examples.
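As a rough illustration of what "regenerating the stats from the dataset" amounts to, the sketch below computes a mean and standard deviation over all frames and stacks them into one array. The layout (row 0 = mean, row 1 = standard deviation), the toy pitch values, and the helper name `compute_stats` are assumptions made for this example; the preprocessing scripts in the corresponding examples directory remain the authoritative way to produce pitch_stats.npy and energy_stats.npy.

```python
import numpy as np

def compute_stats(per_utterance_values):
    """Stack the mean and std computed over all frames of all utterances."""
    all_frames = np.concatenate(
        [np.asarray(v, dtype=np.float64).reshape(-1) for v in per_utterance_values]
    )
    mean = all_frames.mean()
    std = all_frames.std()
    # Assumed layout: first row is the mean, second row is the std.
    return np.array([[mean], [std]], dtype=np.float32)

# Toy pitch contours for two utterances (made-up values, in Hz).
pitch_per_utt = [np.array([100.0, 120.0]), np.array([140.0, 160.0])]
pitch_stats = compute_stats(pitch_per_utt)

# Saving is left commented out so the sketch has no side effects:
# np.save("pitch_stats.npy", pitch_stats)

# A consumer that expects a (mean, std) pair can unpack the array directly.
pitch_mean, pitch_std = pitch_stats
```

The same function could be applied to per-utterance energy values. Note that the stats must be computed from the same features the model was trained on; stats from a different dataset or a different extraction setup would shift the normalized pitch and energy and could degrade the output.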

laishujie commented 1 year ago

> If they are missing, you can regenerate them from the dataset; see the data processing scripts in the corresponding examples.

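OK, I ran the preprocessing pipeline and generated the missing files. However, when I now run the demos/style_fs2/style_syn.py example, the output is noisy and the audio is also longer than expected; in short, it is unlistenable 0_0!! Is this because no speaker was specified?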