liusongxiang / ppg-vc

PPG-Based Voice Conversion
Apache License 2.0

finetuning vocoder #32

Open Pked01 opened 1 year ago

Pked01 commented 1 year ago

Hi, I am new to the field of VC. I am trying to fine-tune the model for specific speakers so that the output sounds more like each target speaker. After going through the repo, I thought the best approach would be to fine-tune the vocoder (as suggested here: https://github.com/liusongxiang/ppg-vc/issues/11 ). From there I went to the HiFi-GAN repo, as suggested in the README ( https://github.com/jik876/hifi-gan ), and tested other pretrained models such as LJ_V3 and LJ_FT_T2_V3. With these models, the output I get is just noise. Output: https://drive.google.com/file/d/19xXGF_u0EBtaiFDlg5CkAiEi8Xzc_dd3/view?usp=share_link

--> Can you elaborate on how to fine-tune on custom data (specific speakers)? I have 15-20 minutes of data for each of 10 speakers. Will that be sufficient?

--> Is fine-tuning the vocoder alone sufficient to get speech that sounds like specific speakers?

--> How can I use the other models available in the HiFi-GAN repo within the ppg-vc repo? What config changes do I need to make?