chomeyama / SiFiGAN

Official implementation of the source-filter HiFiGAN vocoder
MIT License

How to train for many speakers? #4

Closed smlim01 closed 1 year ago

smlim01 commented 1 year ago

Hello. I am trying to train the pretrained model on a multi-speaker dataset such as VCTK.

I am going to prepare a .scp file, extract features, and train as explained in the README.

Is there an additional step for training with many speakers? For example, preparing speaker labels or calculating each speaker's stats?

chomeyama commented 1 year ago

You don't need any additional steps for multi-speaker training as long as you feed WORLD features to the neural vocoder. The WORLD features already carry enough speaker information for the vocoder to generate each voice without any explicit speaker conditioning. You can also compute the stats over all speakers pooled together rather than per speaker.
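To illustrate what pooling the stats over all speakers means, here is a minimal sketch. It is not the repository's own stats script: the function name `pooled_stats` and the toy data are hypothetical, and it uses plain Python lists so that it runs without dependencies (in practice the features would likely be NumPy arrays loaded from the extracted feature files).

```python
import math


def pooled_stats(utterances):
    """Return per-dimension (mean, std) over every frame of every utterance.

    `utterances` is an iterable of feature matrices, each a list of frames,
    and each frame is a list of floats. Speaker identity is ignored on
    purpose: all frames contribute to a single set of statistics.
    """
    count = 0
    total = None
    total_sq = None
    for frames in utterances:
        for frame in frames:
            if total is None:
                # Allocate the accumulators once the feature size is known.
                total = [0.0] * len(frame)
                total_sq = [0.0] * len(frame)
            for d, value in enumerate(frame):
                total[d] += value
                total_sq[d] += value * value
            count += 1
    if count == 0:
        raise ValueError("no frames found")
    mean = [s / count for s in total]
    # Population variance; clamp at zero to guard against rounding error.
    std = [
        math.sqrt(max(sq / count - m * m, 0.0))
        for sq, m in zip(total_sq, mean)
    ]
    return mean, std


# Hypothetical toy data: two speakers, 2-dimensional features.
speaker_a = [[1.0, 10.0], [3.0, 10.0]]
speaker_b = [[5.0, 10.0], [7.0, 10.0]]
mean, std = pooled_stats([speaker_a, speaker_b])
```

The same mean and standard deviation would then be used to normalize the features of every speaker, so no speaker labels are needed at any point.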

smlim01 commented 1 year ago

Thank you! It was a simple question, so I'll close this issue.