ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"
MIT License

Wavelet Transform and Inverse for F0 #136

Open rafaelvalle opened 2 years ago

rafaelvalle commented 2 years ago

Thank you for making this repo.

I've attached a Jupyter notebook with an implementation of the wavelet transform and its inverse for F0, based on the implementation used in the FastSpeech2 paper.

It would be great if you could add support to your repo for training FastSpeech2 with wavelet-transformed F0, as described in the paper.

pitch_cwt.zip
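
For readers without the attachment, the kind of transform and inverse discussed above can be sketched in plain NumPy. This is not the notebook's code. It is a minimal sketch under stated assumptions: a Mexican hat mother wavelet, ten dyadic scales, and reconstruction weights of (i + 2.5)^(-5/2), as I recall them from Suni et al. and the FastSpeech 2 appendix. The function names, the `smallest_scale` parameter (in frames), and the kernel truncation at five scale widths are hypothetical choices made for illustration.

```python
import numpy as np


def normalise_log_f0(f0_hz):
    """Interpolate unvoiced frames (f0 == 0), take the log, and z-normalise.

    Assumes at least one voiced frame and a non-constant contour.
    """
    f0_hz = np.asarray(f0_hz, dtype=np.float64)
    voiced = f0_hz > 0
    frames = np.arange(len(f0_hz))
    # Linear interpolation across unvoiced gaps keeps the contour continuous.
    filled = np.interp(frames, frames[voiced], f0_hz[voiced])
    log_f0 = np.log(filled)
    mean, std = log_f0.mean(), log_f0.std()
    return (log_f0 - mean) / std, mean, std


def mexican_hat(x):
    """Mexican hat (Ricker) mother wavelet with unit L2 norm."""
    return 2.0 / (np.sqrt(3.0) * np.pi ** 0.25) * (1.0 - x ** 2) * np.exp(-0.5 * x ** 2)


def cwt_f0(contour, num_scales=10, smallest_scale=2.0):
    """Continuous wavelet transform at dyadic scales (assumed: 2, 4, ... frames)."""
    n = len(contour)
    scales = smallest_scale * 2.0 ** np.arange(num_scales)
    coeffs = np.empty((num_scales, n))
    for i, s in enumerate(scales):
        half = int(np.ceil(5 * s))  # truncate the kernel at +/- 5 scale widths
        x = np.arange(-half, half + 1)
        kernel = mexican_hat(x / s) / np.sqrt(s)
        # Full convolution pads the edges with zeros implicitly; slicing the
        # centre keeps exactly one coefficient per input frame.
        full = np.convolve(contour, kernel, mode="full")
        coeffs[i] = full[half:half + n]
    return coeffs, scales


def inverse_cwt_f0(coeffs):
    """Approximate inverse: weighted sum over scales, then re-normalise."""
    num_scales = coeffs.shape[0]
    # Assumed weights (i + 2.5)^(-5/2) with i counted from 1.
    weights = (np.arange(num_scales) + 1 + 2.5) ** -2.5
    recon = (coeffs * weights[:, None]).sum(axis=0)
    return (recon - recon.mean()) / recon.std()


# Synthetic contour in Hz with an unvoiced gap, for illustration only.
f0 = 120.0 + 30.0 * np.sin(np.linspace(0.0, 6.0 * np.pi, 200))
f0[50:70] = 0.0
contour, mean, std = normalise_log_f0(f0)
coeffs, scales = cwt_f0(contour)
recon = inverse_cwt_f0(coeffs)
```

The reconstruction is only approximate, so `recon` is re-normalised to zero mean and unit variance. To get back to Hz one would compute `np.exp(recon * std + mean)` using the statistics returned by `normalise_log_f0`.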

PussyCat0700 commented 6 months ago

Hi,

The reference link in the comment of your notebook is no longer available (it should be https://www.isca-speech.org/archive_v0/ssw8/papers/ssw8_285.pdf). Do you know where I can find it?

I later found that the paper is "Wavelets for intonation modeling in HMM speech synthesis" by Suni et al., cited in Appendix C.1 of the FastSpeech 2 paper. Sorry for the disturbance!