yl4579 / StarGANv2-VC

StarGANv2-VC: A Diverse, Unsupervised, Non-parallel Framework for Natural-Sounding Voice Conversion
MIT License

For song vc what should I do #61

Open panxin801 opened 1 year ago

panxin801 commented 1 year ago

Hello, and thank you for sharing your great work. I have a few questions:

  1. For Mandarin singing voice conversion, I tried training a new StarGANv2-VC model with the pretrained ASR and F0 models, but the results do not sound good. Do you have any advice?
  2. For Mandarin singing voice conversion, do I need to retrain the ASR or F0 model?

I'm looking forward to your reply, and thank you again.

sophiefy commented 1 year ago

Hello, panxin! I'm also working on singing VC with StarGANv2-VC. I didn't retrain the F0 and ASR models. Instead, I built a dataset consisting of Mandarin songs plus Mandarin, Japanese, and English speech. This is my result.

panxin801 commented 1 year ago

@Francis-Komizu Thank you for your reply. Indeed, I think using StarGANv2-VC for singing VC may need some further work.

yl4579 commented 1 year ago

@panxin801 I'm currently working on singing conversion using this model with some further modifications for better performance. I may submit my work to INTERSPEECH next year.

panxin801 commented 1 year ago

@yl4579 Congratulations. I'm looking forward to your work.

MuruganR96 commented 1 year ago

@yl4579, do you mean INTERSPEECH 2022, in September? If so, could you share the paper link here?

yl4579 commented 1 year ago

@mraj96 Sorry, I meant INTERSPEECH next year, so it'll be 2023.

mayank-git-hub commented 1 year ago

@yl4579, thank you for your work on StarGANv2-VC. We have been working on adapting StarGANv2-VC to the singing domain. Please see our paper, https://arxiv.org/abs/2210.11096, which extends StarGANv2-VC to singing voice conversion and also handles the any-to-any case.

mayank-git-hub commented 1 year ago

The main modifications that make StarGANv2-VC work on singing voice are removing the pitch features from the instance normalization layers of the generator and using an absolute pitch reconstruction loss instead of a normalized pitch reconstruction loss.
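
To make the difference between the two pitch losses concrete, here is a minimal sketch in plain Python. It is not the code from the paper or from the StarGANv2-VC repository. It assumes that "normalized" means each F0 contour is divided by its own temporal mean before an L1 comparison, and that "absolute" means comparing the raw F0 values in Hz directly. The function names and the toy contour are hypothetical.

```python
def mean(values):
    """Arithmetic mean of a non-empty list of floats."""
    return sum(values) / len(values)


def l1_distance(a, b):
    """Mean absolute difference between two equal-length F0 contours."""
    return sum(abs(x - y) for x, y in zip(a, b)) / len(a)


def normalized_pitch_loss(f0_source, f0_converted):
    # Assumption: normalization divides each contour by its own temporal mean,
    # so only the relative shape of the contour is compared.
    src_mean = mean(f0_source)
    conv_mean = mean(f0_converted)
    src_norm = [f / src_mean for f in f0_source]
    conv_norm = [f / conv_mean for f in f0_converted]
    return l1_distance(src_norm, conv_norm)


def absolute_pitch_loss(f0_source, f0_converted):
    # Assumption: the absolute variant compares raw F0 values (Hz) directly,
    # so a transposed melody is penalized.
    return l1_distance(f0_source, f0_converted)


# Toy example: the converted contour is the source shifted up one octave.
source = [220.0, 247.0, 262.0]
converted = [2.0 * f for f in source]

print("normalized loss:", normalized_pitch_loss(source, converted))
print("absolute loss:  ", absolute_pitch_loss(source, converted))
```

Under these assumptions, the normalized loss cannot see the octave shift because both contours have the same shape, while the absolute loss grows with the shift. That is one plausible reason an absolute loss suits singing, where the output should stay on the source melody's actual notes.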

billnye2 commented 8 months ago

@mayank-git-hub Do you have a GitHub repository for ROSVC? I couldn't find the source code, and I'm very interested!