marcoppasini / MelGAN-VC

MelGAN-VC: Voice Conversion and Audio Style Transfer on arbitrarily long samples using Spectrograms
MIT License

Metallic voice #14

Dannynis closed 4 years ago

Dannynis commented 4 years ago

Hey, I've come across your repo, and it is extremely helpful and impressive. I ran an experiment with about 30 minutes of training data for the source and target speakers. It pretty quickly produced a very impressive impression of the target, but the speech was quite distorted (like a voice heard through a fan), and the G loss got stuck around 0.6 while the D loss stayed around 0.4 (I used the default parameters for training). Have you tried a harmonic vocoder such as WORLD with the same architecture? With CycleGAN and WORLD I manage to get very clear generated speech, but it is far from the target. Do you have any recommendations for how to tackle this issue? Thank you!

kayuksel commented 3 years ago

@Dannynis Did you find any solution to that?