Howdy:)
I have a question about the preprocessing for WaveNet when using input type 'raw'. In this case, the input is assumed to be in the range [-1, 1], which is why rescaling is required. However, when the rescaling is done, the input "wav" is rescaled together with preem_wav, which is the signal the spectrogram is generated from. My question is: do we actually need to rescale both wav and preem_wav, or is rescaling wav alone sufficient? I believe rescaling the input to the spectrogram generator discards some information that is needed for other tasks (I'm using Tacotron as part of an optimizer I'm building for a larger task). Thank you
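To make the question concrete, here is a minimal NumPy sketch of what I mean. It is not the repository's code: `preemphasis`, `peak_rescale`, `log_magnitude`, the 0.97 coefficient, the 0.999 target and the FFT sizes are all placeholder assumptions, and the signal is synthetic. It only rescales wav for the WaveNet input and compares the log-magnitude spectrogram of preem_wav with and without peak rescaling.

```python
import numpy as np

def preemphasis(wav, coef=0.97):
    # First-order pre-emphasis filter: y[n] = x[n] - coef * x[n-1]
    return np.append(wav[0], wav[1:] - coef * wav[:-1])

def peak_rescale(x, rescaling_max=0.999):
    # Scale so that the largest absolute sample equals rescaling_max
    return x / np.abs(x).max() * rescaling_max

def log_magnitude(x, n_fft=256, hop=64):
    # Minimal framed FFT magnitude in dB (no mel filterbank, no clipping)
    window = np.hanning(n_fft)
    frames = [x[i:i + n_fft] * window
              for i in range(0, len(x) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.array(frames), axis=1))
    return 20.0 * np.log10(np.maximum(mag, 1e-10))

rng = np.random.default_rng(0)
wav = 0.1 * rng.standard_normal(4000)  # synthetic stand-in for a quiet recording
preem_wav = preemphasis(wav)

# Raw WaveNet input: this is the signal that has to lie in [-1, 1]
wav_for_wavenet = peak_rescale(wav)

# Spectrogram of preem_wav without and with peak rescaling
spec_unscaled = log_magnitude(preem_wav)
spec_rescaled = log_magnitude(peak_rescale(preem_wav))

# Gain applied by the rescaling, expressed in dB
gain_db = 20.0 * np.log10(0.999 / np.abs(preem_wav).max())
offset = spec_rescaled - spec_unscaled
print(np.allclose(offset, gain_db, atol=1e-6))
```

In this sketch the two spectrograms differ only by a constant dB offset that depends on each utterance's peak, so what rescaling preem_wav removes is the absolute level of the recording. Whether that matters after the real pipeline's mel projection, clipping and normalization is exactly what I am unsure about.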