jamescasia closed this issue 2 years ago
Hey @jamescasia, I noticed that you just closed the issue. Did you find a solution?
Yes, I was able to convert audio to Tacotron's mel spectrogram and back to audio. I used TacotronSTFT's mel_spectrogram function to convert the audio into mels properly, then used WaveGlow to convert it back, although the quality declined a little. Here's how I did it:
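import torch
from layers import TacotronSTFT  # layers.py in the tacotron2 repo

def stft_to_tmel(stft):
    # Despite the name, `stft` is a 1-D audio tensor scaled to [-1, 1].
    stft = stft.unsqueeze(0)  # add a batch dimension: (1, T)
    stft = torch.autograd.Variable(stft, requires_grad=False)
    mel = TacotronSTFT(filter_length=1024,
                       hop_length=256,
                       win_length=1024,
                       sampling_rate=22050,
                       mel_fmin=0.0, mel_fmax=8000.0).mel_spectrogram(stft)
    mel = torch.squeeze(mel, 0)  # drop the batch dimension: (n_mels, frames)
    return mel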
Thank you!
I have a small piece of code in which I load a librosa sample audio, convert it to a mel spectrogram using librosa.feature.melspectrogram with the parameters detailed in config.json, and then convert it back to audio with WaveGlow. But all I get is high-frequency noise.
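One likely cause of the noise, offered as a sketch rather than a confirmed diagnosis: by default librosa.feature.melspectrogram returns a power mel spectrogram (power=2.0) with no log compression, whereas TacotronSTFT's mel_spectrogram applies the mel filterbank to the magnitude spectrogram and then takes the log after clamping to 1e-5. WaveGlow is trained on the latter, so feeding it uncompressed power values puts the input on the wrong scale. The helper names below are hypothetical, the parameter values are the ones from the snippet above plus 80 mel channels (the TacotronSTFT default), and the output has not been checked to match TacotronSTFT exactly.

```python
import numpy as np


def dynamic_range_compression(x, clip_val=1e-5):
    """Log-compress a magnitude mel spectrogram, clamping small values first."""
    return np.log(np.clip(x, clip_val, None))


def librosa_log_mel(audio, sr=22050):
    """Hypothetical helper: librosa mel spectrogram on a Tacotron-style log scale.

    `audio` is a 1-D float array in [-1, 1] sampled at `sr`.
    """
    # Imported here so the compression helper above works without librosa.
    import librosa

    mel = librosa.feature.melspectrogram(
        y=audio, sr=sr,
        n_fft=1024, hop_length=256, win_length=1024,
        n_mels=80, fmin=0.0, fmax=8000.0,
        power=1.0,           # magnitude, not librosa's default power=2.0
        pad_mode="reflect",  # assumption: chosen to mirror TacotronSTFT's padding
    )
    return dynamic_range_compression(mel)  # shape: (80, frames)
```

The result is a NumPy array of shape (80, frames). Wrapping it with torch.from_numpy(...).unsqueeze(0) should give the (1, 80, frames) batch that WaveGlow's infer method takes, assuming the audio was loaded at 22050 Hz.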