Rayhane-mamah / Tacotron-2

DeepMind's Tacotron-2 Tensorflow implementation
MIT License

Wavenet preprocessing #492

Open andrekassis opened 3 years ago

andrekassis commented 3 years ago

Howdy:)

I have a question about the WaveNet preprocessing when the input type is 'raw'. In this case the input is assumed to lie in the range [-1, 1], which is why rescaling is required. However, when the rescaling is done, the input "wav" is rescaled together with preem_wav, which is the signal the spectrogram is generated from. My question is: do we actually need to rescale both wav and preem_wav, or is rescaling wav alone sufficient? I believe rescaling the input to the spectrogram generator discards some information that is needed for other tasks (I'm using Tacotron as part of an optimizer I'm building for a larger task). Thank you.
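To make the question concrete, here is a minimal sketch of the two variants, assuming the rescaling is a per-utterance peak normalization and the pre-emphasis is a first-order filter. The function names (`preemphasis`, `rescale`, `prepare`), the `rescale_preem` flag, and the default values for `k` and `rescaling_max` are hypothetical and chosen for illustration; they are not taken from the repository's code.

```python
import numpy as np

def preemphasis(wav, k=0.97):
    # First-order pre-emphasis: y[n] = x[n] - k * x[n-1] (k is an assumed default)
    return np.append(wav[0], wav[1:] - k * wav[:-1])

def rescale(x, rescaling_max=0.999):
    # Peak-normalize so that max |x| equals rescaling_max (assumed default)
    return x / np.abs(x).max() * rescaling_max

def prepare(wav, rescale_preem=False, k=0.97, rescaling_max=0.999):
    # Hypothetical helper: always rescale the waveform that feeds WaveNet,
    # and rescale the spectrogram source only when rescale_preem is True.
    preem_wav = preemphasis(wav, k)
    out_wav = rescale(wav, rescaling_max)
    if rescale_preem:
        preem_wav = rescale(preem_wav, rescaling_max)
    return out_wav, preem_wav

# Synthetic one-second signal standing in for a loaded utterance
rng = np.random.default_rng(0)
wav = 0.3 * rng.standard_normal(16000)

# Variant asked about: rescale wav only, leave preem_wav at its original level
out_wav, preem_wav = prepare(wav, rescale_preem=False)

# Variant described in the question: rescale both signals
out_wav_both, preem_wav_both = prepare(wav, rescale_preem=True)
```

With `rescale_preem=False` the spectrogram source keeps its original level, so the absolute loudness of the utterance survives into the spectrogram. Since peak normalization multiplies the signal by a single scalar, it shifts a log-magnitude spectrogram by a constant per utterance; whether that offset matters downstream depends on how the spectrogram is normalized afterwards, which this sketch does not cover.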