NVIDIA / tacotron2

Tacotron 2 - PyTorch implementation with faster-than-realtime inference
BSD 3-Clause "New" or "Revised" License
4.97k stars, 1.37k forks

Strange effects in re-synthesized audio #604

Open KaushalNaresh opened 10 months ago

KaushalNaresh commented 10 months ago

Hi,

I am working on re-synthesizing my audio. I first extract a sequence of character units from the HuBERT (hubert_base_ls960) model, pass that sequence to the Tacotron 2 model to generate a mel spectrogram, and then feed the mel spectrogram to the WaveGlow model to get the re-synthesized audio back. When I compare the two audio files, however, some strange effects have been added to the re-synthesized one. Can you advise me on what is going wrong here?

I have attached 2 audio files for your reference.
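When comparing an original recording with a re-synthesized one, it can help to first rule out plain format differences between the two files. The sketch below is only an illustration and not a diagnosis of this issue: it reads the WAV headers with Python's standard `wave` module and reports which fields differ. The function names `describe_wav`, `header_mismatches`, and `write_sine` are hypothetical helpers introduced here, and the 16000 Hz and 22050 Hz rates in the demo are assumptions chosen because hubert_base_ls960 was trained on 16 kHz audio while this repository's default `sampling_rate` is 22050. The actual rates of the attached files are not known.

```python
import math
import os
import struct
import tempfile
import wave


def describe_wav(path):
    """Return basic header properties of a PCM WAV file."""
    with wave.open(path, "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "sample_rate": rate,
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "duration_s": frames / rate,
        }


def header_mismatches(original_path, resynth_path):
    """Map each differing header field (duration excluded) to its two values."""
    a = describe_wav(original_path)
    b = describe_wav(resynth_path)
    return {k: (a[k], b[k]) for k in a if k != "duration_s" and a[k] != b[k]}


def write_sine(path, sample_rate, seconds=0.5, freq=440.0):
    """Write a mono 16-bit sine tone. Used only to keep the demo self-contained."""
    n = int(sample_rate * seconds)
    with wave.open(path, "wb") as wav:
        wav.setnchannels(1)
        wav.setsampwidth(2)
        wav.setframerate(sample_rate)
        for i in range(n):
            value = int(32767 * 0.3 * math.sin(2 * math.pi * freq * i / sample_rate))
            wav.writeframes(struct.pack("<h", value))


if __name__ == "__main__":
    # Stand-in files; replace these paths with the real original and
    # re-synthesized recordings (assumed here to be PCM WAV files).
    tmp = tempfile.mkdtemp()
    original = os.path.join(tmp, "original.wav")
    resynth = os.path.join(tmp, "resynth.wav")
    write_sine(original, 16000)  # assumed rate, for illustration only
    write_sine(resynth, 22050)   # assumed rate, for illustration only
    print(describe_wav(original))
    print(describe_wav(resynth))
    print(header_mismatches(original, resynth))
```

If the two files report different sample rates, resampling one of them before listening or plotting spectrograms keeps the comparison fair. Matching headers do not rule out a mismatch earlier in the pipeline, for example between the audio fed to HuBERT and the rate Tacotron 2 and WaveGlow were trained on.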

Thanks

Audio Files