ak9250 closed this issue 3 years ago
Yes, this is expected.
ok thanks
@KimythAnly what are some ways this could be improved for singing synthesis in particular, going from a source singer identity to a target speaker identity?
Hmm, if you have a singing corpus, you can just train a model on that data. Also, using f0 as an additional feature is useful. As far as I know, many singing VC systems use this feature.
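To make the f0 suggestion concrete, here is a minimal sketch of estimating f0 for one frame and appending it to a per-frame feature vector. It is only an illustration, not code from this repository: `estimate_f0` and `append_f0` are hypothetical helpers, the autocorrelation peak picking is a deliberately simple stand-in for a proper pitch tracker, and the search range, the 0.3 voicing threshold, and the log-f0 plus voiced-flag layout are all assumptions.

```python
import math


def estimate_f0(frame, sr, fmin=80.0, fmax=800.0):
    """Estimate f0 in Hz for one frame via autocorrelation; 0.0 means unvoiced."""
    energy = sum(x * x for x in frame)
    if energy < 1e-8:
        return 0.0  # silent frame, treat as unvoiced
    min_lag = int(sr / fmax)
    max_lag = min(int(sr / fmin), len(frame) - 1)
    best_lag, best_r = 0, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Unnormalized autocorrelation at this lag.
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r > best_r:
            best_lag, best_r = lag, r
    # Assumed voicing rule: a weak peak relative to frame energy counts as unvoiced.
    if best_lag == 0 or best_r < 0.3 * energy:
        return 0.0
    return sr / best_lag


def append_f0(features, f0):
    """Append log-f0 and a voiced flag to a per-frame feature vector (assumed layout)."""
    voiced = f0 > 0.0
    return list(features) + [math.log(f0) if voiced else 0.0, 1.0 if voiced else 0.0]


# Usage: a synthetic 220 Hz tone standing in for one frame of a sung note.
sr = 16000
frame = [math.sin(2 * math.pi * 220 * n / sr) for n in range(1024)]
f0 = estimate_f0(frame, sr)
frame_features = append_f0([0.1, 0.2, 0.3], f0)  # placeholder spectral features
```

Because the lag is an integer number of samples, the estimate is only accurate to within a few Hz here. For real singing data, a dedicated pitch tracker would likely be a better choice, and how the extra feature is fed to the model depends on the architecture.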
@KimythAnly ok thanks. The demo page did show going from singing to speaker to singing, but the quality is a bit degraded. I will also look into this approach: https://nobody996.github.io/FastSVC/
I tried the first sample here as input: https://speechresearch.github.io/hifisinger/ and the output sounds like this. Is this expected? https://soundcloud.com/user-426165954/7000000184-to-p226-001