PlayVoice / lora-svc

Singing voice conversion based on Whisper, with LoRA for singing voice cloning
MIT License
615 stars 77 forks

I have a question about using lora for fine-tuning #64

Open h0ngc opened 1 year ago

h0ngc commented 1 year ago

I have trained a VITS model, but when I apply LoRA to the attention layers, fine-tuning does not work properly. Could you tell me which layers you applied LoRA to when fine-tuning the VITS model, and what values you used for rank and alpha?

MaxMax2016 commented 1 year ago

There is no VITS here, just a BigVGAN. After the upsample layers, speaker info is used to change x with weights and biases.
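
As a rough illustration of the reply above, the following sketch shows one way speaker information can change a feature map x through a weight and a bias: a per-channel scale and shift computed from a speaker embedding. This is an assumption about what "weights and biases" means here, not code taken from lora-svc. The names `speaker_modulate`, `w_scale`, and `w_shift`, and all shapes, are hypothetical.

```python
import numpy as np


def speaker_modulate(x, spk, w_scale, w_shift):
    """Scale and shift each channel of x using a speaker embedding.

    x:       (channels, time) feature map, e.g. the output of upsample layers
    spk:     (spk_dim,) speaker embedding
    w_scale: (channels, spk_dim) projection that yields a per-channel weight
    w_shift: (channels, spk_dim) projection that yields a per-channel bias
    All names and shapes are hypothetical, chosen only for illustration.
    """
    scale = w_scale @ spk  # (channels,)
    shift = w_shift @ spk  # (channels,)
    # Broadcast the per-channel weight and bias over the time axis.
    return x * scale[:, None] + shift[:, None]


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((4, 10))
    spk = rng.standard_normal(8)
    w_scale = rng.standard_normal((4, 8))
    w_shift = rng.standard_normal((4, 8))
    y = speaker_modulate(x, spk, w_scale, w_shift)
    print(y.shape)  # same shape as x
```

In this sketch only the two small projections depend on the speaker, so adapting to a new voice would touch few parameters while the rest of the vocoder stays fixed.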

h0ngc commented 1 year ago

Thanks for your reply. Can I ask one more thing? While checking your repo, I noticed that you set conv_post, activation, and speaker_adaptor to be trainable. As far as I know, LoRA attaches extra linear layers to adapt existing weights, but your repo seems to fine-tune part of the model directly. Is this some other adaptation of LoRA?
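
The distinction raised in this comment, training a chosen subset of existing modules rather than adding new low-rank layers, can be sketched as a simple split of parameter names. The three module names come from the comment above; every other parameter name below, and the helper `split_trainable`, are hypothetical and only stand in for a real model's parameter list.

```python
def split_trainable(param_names, trainable_modules):
    """Partition parameter names by their top-level module name.

    A parameter is treated as trainable when the part of its name before
    the first dot is listed in trainable_modules; everything else is frozen.
    """
    trainable, frozen = [], []
    for name in param_names:
        top_level = name.split(".")[0]
        if top_level in trainable_modules:
            trainable.append(name)
        else:
            frozen.append(name)
    return trainable, frozen


if __name__ == "__main__":
    # Hypothetical parameter names for a vocoder-style generator.
    names = [
        "conv_pre.weight",
        "ups.0.weight",
        "resblocks.0.convs1.0.weight",
        "speaker_adaptor.scale.weight",
        "activation.alpha",
        "conv_post.weight",
    ]
    trainable, frozen = split_trainable(
        names, {"conv_post", "activation", "speaker_adaptor"}
    )
    print("trainable:", trainable)
    print("frozen:", frozen)
```

In a PyTorch model the same split would typically be applied by setting `requires_grad` to `False` on the frozen parameters. No new layers are added, which is what separates this kind of partial fine-tuning from LoRA.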

MaxMax2016 commented 1 year ago

LoRA is a low-rank adapter. What is used here is another kind of adapter, as in Microsoft's "AdaSpeech: Adaptive Text to Speech for Custom Voice" or "Adapter-Based Extension of Multi-Speaker Text-to-Speech Model for New Speakers".
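
For contrast with the adapter approach described here, this is a minimal sketch of what "low-rank adapter" means in LoRA: the pretrained weight W stays frozen and only a low-rank product B·A, scaled by alpha / rank, is learned. The class `LoRALinear` and the rank and alpha values in the demo are illustrative assumptions. They are not settings from lora-svc, and the thread does not state any recommended values.

```python
import numpy as np


class LoRALinear:
    """Linear layer y = x W^T with an added low-rank update (alpha / r) * x A^T B^T."""

    def __init__(self, weight, rank, alpha, seed=0):
        d_out, d_in = weight.shape
        rng = np.random.default_rng(seed)
        self.weight = weight                               # frozen pretrained weight
        self.A = rng.standard_normal((rank, d_in)) * 0.01  # trainable, small random init
        self.B = np.zeros((d_out, rank))                   # trainable, zero init
        self.scaling = alpha / rank

    def __call__(self, x):
        base = x @ self.weight.T
        update = (x @ self.A.T) @ self.B.T
        return base + self.scaling * update

    def num_trainable(self):
        # Only A and B would receive gradient updates.
        return self.A.size + self.B.size


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    W = rng.standard_normal((64, 64))
    layer = LoRALinear(W, rank=4, alpha=8)  # illustrative values only
    x = rng.standard_normal((2, 64))
    # With B initialised to zero the adapted layer matches the frozen one.
    print(np.allclose(layer(x), x @ W.T))
    print(layer.num_trainable(), "trainable values vs", W.size, "in the full weight")
```

Because B starts at zero, training begins from the unchanged pretrained behaviour, and the number of trainable values grows with rank × (d_in + d_out) rather than d_in × d_out.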

MaxMax2016 commented 1 year ago

lora_svc is not real LoRA. The name was chosen just to get SVC developers thinking about LoRA.