Hi,
I am trying to re-synthesize my audio: I first extract a sequence of character units with the HuBERT (hubert_base_ls960) model, pass that sequence to a Tacotron 2 model to generate a mel-spectrogram, and then feed the mel-spectrogram to a WaveGlow model to get the re-synthesized audio back. When I compare the two audio files, the re-synthesized one contains some strange artifacts that are not in the original. Could you advise me on what is going wrong here?
I have attached both audio files (the original and the re-synthesized one) for your reference.
Thanks
Audio Files