winddori2002 / TriAAN-VC

TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion
MIT License

Silence trimming code #6

Closed Souvic closed 1 year ago

Souvic commented 1 year ago

Hi, thanks for the great work. I could not find any code for trimming silence, although the data path suggests that the inputs should be silence-trimmed. Is the same true at inference time? Why do we need silence trimming? Is it for faster training? Wouldn't training be more robust if we kept the silence?

winddori2002 commented 1 year ago

Hi, I used the librosa package to trim silence (the setup can differ depending on the data). However, the VCTK corpus used in the paper is already silence-trimmed. Trimmed data is usually effective in both the training and test phases: since our objective is to reconstruct speech, not silence, trimming helps. Furthermore, what we want to extract from the reference audio is the target's voice (the target's characteristics), not silence.
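
For readers who want to reproduce the trimming step, here is a minimal sketch of energy-threshold trimming of leading and trailing silence. It is not the code used for the paper: the thread does not give the exact settings, so the function name `trim_silence` and the values `top_db=30`, `frame_length=2048`, and `hop_length=512` are assumptions for illustration. It uses only NumPy so the logic is visible. With librosa, the comparable call is `librosa.effects.trim(y, top_db=...)`, which returns the trimmed signal and the kept sample interval; its framing details differ from this sketch.

```python
import numpy as np


def trim_silence(y, top_db=30.0, frame_length=2048, hop_length=512):
    """Trim leading/trailing silence from a mono signal (illustrative sketch).

    A frame counts as non-silent when its RMS energy is within `top_db`
    decibels of the loudest frame. All default values are assumptions.
    Returns the trimmed signal and the (start, end) sample interval kept.
    """
    y = np.asarray(y, dtype=np.float64)
    if y.size == 0:
        return y, (0, 0)

    # Start index of every full analysis frame; trailing samples that do
    # not fill a whole frame are not analysed.
    starts = np.arange(0, max(len(y) - frame_length, 0) + 1, hop_length)
    rms = np.array(
        [np.sqrt(np.mean(y[s:s + frame_length] ** 2)) for s in starts]
    )

    ref = rms.max()
    if ref == 0.0:
        # The whole signal is digital silence: nothing to keep.
        return y[:0], (0, 0)

    # Frame energy in dB relative to the loudest frame.
    db = 20.0 * np.log10(np.maximum(rms, 1e-10) / ref)
    keep = np.flatnonzero(db > -top_db)

    start = int(starts[keep[0]])
    end = int(min(starts[keep[-1]] + frame_length, len(y)))
    return y[start:end], (start, end)


if __name__ == "__main__":
    sr = 16000  # hypothetical sample rate for the demo
    t = np.arange(sr) / sr
    tone = 0.5 * np.sin(2 * np.pi * 220.0 * t)
    # One second of silence, one second of tone, one second of silence.
    y = np.concatenate([np.zeros(sr), tone, np.zeros(sr)])
    trimmed, (start, end) = trim_silence(y)
    print(len(y), len(trimmed), start, end)
```

Only the edges are trimmed here, so pauses inside an utterance are kept. Since the reply above says trimming helps in both the training and test phases, the same function would be applied to source and reference audio at inference time as well.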

Souvic commented 1 year ago

Thanks, will try that.