Why is the audio corresponding to the mel feature needed when synthesizing?

ksw0306 / FloWaveNet

A Pytorch implementation of "FloWaveNet: A Generative Flow for Raw Audio"

MIT License

490 stars 109 forks

Closed NewEricWang closed 5 years ago

NewEricWang commented 5 years ago

I only want to input the mel-feature generated from tacotron2. How should I modify the script "synthesize.py"?

anupam456 commented 5 years ago

Only the audio length is required, not the audio itself. You can replace one line in the synthesize function in synthesize.py with the modified line below.

Current code: q_0 = Normal(x.new_zeros(x.size()), x.new_ones(x.size()))

Replace it with: q_0 = Normal(c.new_zeros(1, 1, c.size()[2] * 256), c.new_ones(1, 1, c.size()[2] * 256))

Here 256 is the hop_length from preprocessing.py, so c.size()[2] * 256 is the number of audio samples covered by the mel frames.
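For anyone who wants this as a reusable helper, here is a minimal sketch of the same idea. The function names prior_shape and make_prior are hypothetical (they are not part of the repository), and the sketch assumes the mel tensor c is shaped (batch, n_mels, frames), which is what c.size()[2] in the line above suggests. It also generalizes the hard-coded batch size of 1, which is an assumption on my part.

```python
def prior_shape(mel_shape, hop_length=256):
    """Shape of the noise tensor for a mel of shape (batch, n_mels, frames).

    Assumes each mel frame covers hop_length audio samples.
    """
    batch, _n_mels, frames = mel_shape
    return (batch, 1, frames * hop_length)


def make_prior(c, hop_length=256):
    """Build a standard normal prior sized from the mel tensor c alone.

    Hypothetical helper; torch is imported lazily so that prior_shape
    can be used without it.
    """
    from torch.distributions.normal import Normal

    shape = prior_shape(tuple(c.size()), hop_length)
    # new_zeros/new_ones keep the dtype and device of c
    return Normal(c.new_zeros(shape), c.new_ones(shape))
```

With this, the line in synthesize.py would become q_0 = make_prior(c), and the rest of the function should not need the audio tensor x for sizing. If the hop_length used in preprocessing differs from 256, pass it explicitly.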

NewEricWang commented 5 years ago

@anupam456, thank you! It works.

delgerdalai commented 5 years ago

Hi @NewEricWang, which tacotron2 project do you use?