How to integrate it with r9y9/Tacotron-2 ?

NVIDIA / nv-wavenet

Reference implementation of real-time autoregressive wavenet inference

BSD 3-Clause "New" or "Revised" License


How to integrate it with r9y9/Tacotron-2 ? #48

Open rishikksh20 opened 6 years ago

rishikksh20 commented 6 years ago

r9y9's Tacotron-2 implementation (https://github.com/r9y9/Tacotron-2) outputs a mel-spectrogram, but when I convert the .npy file to a torch tensor, feed it to nv-wavenet, and run inference, it generates noise. Do I have to do some extra processing on the mel-spectrogram generated by Tacotron 2 before passing it to nv-wavenet for speech synthesis?

gsoul commented 6 years ago

Ideally, you should train on the same spectrograms that you're going to run inference on.
At minimum, you might want to look through the closed issues of NVIDIA/tacotron2 to find some examples you could base your solution on, like this one: https://github.com/NVIDIA/tacotron2/issues/52

RPrenger commented 6 years ago

@rishikksh20 Are you trying to use the WaveNet from r9y9 for inference, or are you training a WaveNet using the code in the PyTorch directory? If you're using a WaveNet from r9y9, can you post the code you're using to try the bindings?

rishikksh20 commented 6 years ago

@rafaelvalle Is it possible to integrate r9y9's Tacotron 2 with this repo? If so, what changes are needed to make the mel-spectrogram output by Tacotron 2 compatible with the mel-spectrogram input that nv-wavenet expects?

rafaelvalle commented 6 years ago

@rishikksh20 In addition to training Tacotron2 and WaveNet on the same data, both models have to use the same mel-spectrogram representation, including the FFT and mel parameters.
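
One way to check the point above is to compare the audio settings of the two models side by side before running inference. The following is only a minimal sketch: the parameter names, the values, and the `find_mismatches` helper are hypothetical and chosen for illustration, so substitute whatever each project's own configuration actually defines.

```python
from typing import Any, Dict, List, Tuple

# Hypothetical parameter names and values, for illustration only.
# Replace them with the settings used to train each model.
tacotron_params: Dict[str, Any] = {
    "sampling_rate": 22050,
    "filter_length": 1024,   # FFT size
    "hop_length": 256,
    "win_length": 1024,
    "n_mel_channels": 80,
    "mel_fmin": 0.0,
    "mel_fmax": 8000.0,
}

# Pretend the vocoder was trained with a different hop size and lower mel edge.
wavenet_params: Dict[str, Any] = dict(tacotron_params, hop_length=275, mel_fmin=125.0)


def find_mismatches(a: Dict[str, Any], b: Dict[str, Any]) -> List[Tuple[str, Any, Any]]:
    """Return (name, value_in_a, value_in_b) for every setting that differs.

    A setting that is missing on one side is reported with None on that side.
    """
    mismatches = []
    for key in sorted(set(a) | set(b)):
        left, right = a.get(key), b.get(key)
        if left != right:
            mismatches.append((key, left, right))
    return mismatches


mismatches = find_mismatches(tacotron_params, wavenet_params)
for name, left, right in mismatches:
    print(f"{name}: tacotron={left!r} wavenet={right!r}")
```

An empty result only shows that the listed settings agree. It does not cover other differences in the representation, such as how the spectrogram is scaled or normalized, or the axis order of the saved array, which would need to be checked separately.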