ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
MIT License

New parameters for sampling_rate=44.1 kHz #164

Open Adibian opened 2 years ago

Adibian commented 2 years ago

Hi, thanks for this great implementation. My data is sampled at 44.1 kHz, so which parameters in the config files need to change to train the synthesizer? Thanks for any suggestions.

aidosRepoint commented 1 year ago

Hi! I think the bigger concern for you is not the sampling rate of the FastSpeech2 model but the vocoder. Besides tuning the configs and training FS2 further, you would also have to do the same for the vocoder. You could also look for a pretrained HiFi-GAN for 44100 Hz, but I couldn't find one. Please let me know if you find it.
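
As a rough starting point for the config side, one common heuristic is to scale the sample-count STFT settings by the sampling-rate ratio so that each frame covers the same duration as before. The sketch below is only an illustration of that heuristic, not a setting confirmed by the repo author. The key names and the 22.05 kHz starting values are assumptions, so check them against your own preprocessing config. The vocoder would then need to be trained with matching values.

```python
def scale_stft_params(base, target_sr):
    """Scale sample-count STFT settings so frame durations stay constant.

    `base` is a dict of assumed (hypothetical) config values; the key names
    are illustrative and may differ from your actual config files.
    """
    ratio = target_sr / base["sampling_rate"]
    scaled = dict(base)
    scaled["sampling_rate"] = target_sr
    for key in ("filter_length", "hop_length", "win_length"):
        # Round to the nearest whole number of samples.
        scaled[key] = int(round(base[key] * ratio))
    return scaled


# Assumed 22.05 kHz starting values (an assumption, verify against your config).
BASE_22K = {
    "sampling_rate": 22050,
    "filter_length": 1024,
    "hop_length": 256,
    "win_length": 1024,
}

params_44k = scale_stft_params(BASE_22K, 44100)
print(params_44k)
```

With these assumed inputs the hop stays at about 11.6 ms per frame (256 / 22050 s and 512 / 44100 s are equal). Any upper mel frequency limit in the config can be at most the Nyquist frequency, which is 22050 Hz at this sampling rate.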