When running Text-To-Speech FastSpeech2 + Parallel WaveGAN on CSMSC, is there currently a setting for changing the bitrate of the final audio?

PaddlePaddle / PaddleSpeech

Easy-to-use Speech Toolkit including Self-Supervised Learning model, SOTA/Streaming ASR with punctuation, Streaming TTS with text frontend, Speaker Verification System, End-to-End Speech Translation and Keyword Spotting. Won NAACL2022 Best Demo Award.

https://paddlespeech.readthedocs.io

Apache License 2.0

When running Text-To-Speech FastSpeech2 + Parallel WaveGAN on CSMSC, is there currently a setting for changing the bitrate of the final audio? #1001

Closed zouhan6806504 closed 3 years ago

zouhan6806504 commented 3 years ago

Or should I transcode the audio with a separate tool after it has been generated?

yt605155624 commented 3 years ago

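You could try setting samplerate to the value you need at line 115 of https://github.com/PaddlePaddle/PaddleSpeech/blob/develop/paddlespeech/t2s/exps/fastspeech2/synthesize_e2e.py. At the moment it uses the sample rate that was set when the model was trained.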

yt605155624 commented 3 years ago

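I tried it: forcing a different value there changes the pitch, so it is better to save the audio first and convert it afterwards. You can do that on the command line with a tool such as sox, or, after saving, load the file with librosa and save it again: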

import librosa
wav, _ = librosa.load('tmp.wav', sr=16000)  # resamples to 16 kHz while loading
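
The pitch change reported above comes from relabeling the samples rather than resampling them. The sketch below illustrates this with the standard library only. It assumes, purely for illustration, that the model produces 24 kHz audio (check the `fs` value in your own training config); `write_sine` and `perceived_freq` are hypothetical helpers and are not part of PaddleSpeech.

```python
import math
import os
import struct
import tempfile
import wave


def write_sine(path, freq_hz, generated_sr, header_sr, seconds=1.0):
    """Synthesize a sine tone at generated_sr, but label the file with header_sr."""
    n = int(generated_sr * seconds)
    frames = b"".join(
        struct.pack("<h", int(16000 * math.sin(2 * math.pi * freq_hz * i / generated_sr)))
        for i in range(n)
    )
    with wave.open(path, "wb") as f:
        f.setnchannels(1)           # mono
        f.setsampwidth(2)           # 16-bit PCM
        f.setframerate(header_sr)   # the rate a player will trust
        f.writeframes(frames)


def perceived_freq(path):
    """Estimate the tone frequency from rising zero crossings and the header rate."""
    with wave.open(path, "rb") as f:
        sr = f.getframerate()
        n = f.getnframes()
        data = struct.unpack("<%dh" % n, f.readframes(n))
    crossings = sum(1 for a, b in zip(data, data[1:]) if a < 0 <= b)
    return crossings * sr / n


if __name__ == "__main__":
    tmp = tempfile.mkdtemp()
    honest = os.path.join(tmp, "honest.wav")
    forced = os.path.join(tmp, "forced.wav")
    write_sine(honest, 440.0, generated_sr=24000, header_sr=24000)
    write_sine(forced, 440.0, generated_sr=24000, header_sr=16000)
    print("header matches the data:", perceived_freq(honest), "Hz")
    print("header forced to 16 kHz:", perceived_freq(forced), "Hz")
```

With the header forced to 16 kHz, the same samples play back at two thirds of the speed, so the tone is lower and the clip is longer. A real resampler (sox, librosa, ffmpeg) recomputes the samples instead, which keeps pitch and duration.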

zouhan6806504 commented 3 years ago

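> I tried it: forcing a different value there changes the pitch, so it is better to save the audio first and convert it afterwards. You can do that on the command line with a tool such as sox, or, after saving, load the file with librosa and save it again:
> wav, _ = librosa.load('tmp.wav', sr=16000)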

Yes, after I changed it my result was pitch-shifted as well. I used ffmpeg instead, which does the conversion in one step:
ffmpeg -loglevel quiet -i {src} -ar 16000 {target}
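
For batch use, the ffmpeg command from this thread can be wrapped in a few lines of Python. This is a minimal sketch, assuming `ffmpeg` is installed and on `PATH`; `build_ffmpeg_cmd` and `resample_with_ffmpeg` are hypothetical helper names, and the flags are exactly the ones quoted above.

```python
import subprocess


def build_ffmpeg_cmd(src, target, sample_rate=16000):
    """Return the argument list for: ffmpeg -loglevel quiet -i SRC -ar RATE TARGET."""
    return [
        "ffmpeg",
        "-loglevel", "quiet",     # suppress ffmpeg's console output
        "-i", str(src),           # input file written by the synthesizer
        "-ar", str(sample_rate),  # output audio sample rate in Hz
        str(target),
    ]


def resample_with_ffmpeg(src, target, sample_rate=16000):
    """Run ffmpeg and raise CalledProcessError if it exits with an error."""
    subprocess.run(build_ffmpeg_cmd(src, target, sample_rate), check=True)


if __name__ == "__main__":
    # Only prints the command; call resample_with_ffmpeg() to actually convert.
    print(" ".join(build_ffmpeg_cmd("tmp.wav", "tmp_16k.wav")))
```

Passing the arguments as a list avoids shell quoting problems with file names that contain spaces. If the target file may already exist, adding ffmpeg's `-y` flag makes it overwrite without prompting.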