Closed. Charlottecuc closed this issue 2 years ago.
For example, how can I run inference with the DiffSinger model (trained on the PopCS dataset) on an opencpop song? Is it possible to extract f0 from the opencpop song's wav and feed it to DiffSinger (PopCS) as the f0 input at inference? (Although I think the extracted gt f0 is to some extent entangled with the source speaker's characteristics, so this would probably fail.)
Why don't you use https://github.com/MoonInTheRiver/DiffSinger/blob/master/usr/configs/midi/readme.md ? It includes the checkpoints trained on opencpop and it does not use gt f0.
No. What I mean is: DiffSinger (PopCS version) cannot run inference on unseen songs, right?
The songs in the test set of PopCS are the unseen (no overlap with training data) songs.
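What I mean is: because the songs in the PopCS test set are sung by the same speaker as the training set, the gt f0 extracted at inference is also coupled with that speaker's characteristics. As things stand, can DiffSinger (PopCS) only be tested with songs that the PopCS singer has actually sung? (Otherwise, the gt f0 extracted from a wav sung by someone else would not match the PopCS speaker's own fundamental frequency.) For example, if I wanted to synthesize "Happy Birthday to You" (《祝你生日快乐》), which is outside the dataset, with the PopCS timbre, would that be impossible? @MoonInTheRiver Thanks.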
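To make the mismatch described above concrete: one generic way to move an f0 contour from one singer's register toward another's is to transpose it so that its mean log-f0 matches a target value. The sketch below is only an illustration of that idea and is not something this thread or the DiffSinger repository confirms as a working fix. The function name `shift_f0_to_target`, the convention that unvoiced frames are marked with 0, and all numeric values are assumptions made for the example.

```python
import math


def shift_f0_to_target(f0_hz, target_mean_log_f0):
    """Transpose a frame-level f0 contour so its mean log-f0 matches a target.

    Assumption: unvoiced frames are marked with 0 and are left untouched.
    """
    voiced = [f for f in f0_hz if f > 0]
    if not voiced:
        # Nothing to shift if the contour has no voiced frames.
        return list(f0_hz)
    source_mean_log_f0 = sum(math.log(f) for f in voiced) / len(voiced)
    # A constant ratio in Hz is a constant offset in log-f0 (a transposition).
    ratio = math.exp(target_mean_log_f0 - source_mean_log_f0)
    return [f * ratio if f > 0 else 0.0 for f in f0_hz]


# Hypothetical contour from the source singer and a hypothetical target register.
source_f0 = [0.0, 220.0, 220.0, 440.0, 0.0]
target_mean_log_f0 = math.log(330.0)
shifted_f0 = shift_f0_to_target(source_f0, target_mean_log_f0)
```

A shift like this only changes the overall register. It keeps the melody's intervals, but it does not remove singer-specific detail such as vibrato or note transitions, which is the coupling the question is worried about.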
Solved.
@Charlottecuc Hmm... so can inference use samples from a different speaker? How do the results sound?
@Charlottecuc I just ran into this problem as well. I want to use the DiffSinger model trained on PopCS to generate a song from the lyrics information provided in OpenCPoP. Could you please tell me how you achieved that? Many thanks!!
Is there any solution for inference with an unseen singer? @lixucuhk @RayeRen @Opdoop @Charlottecuc @MoonInTheRiver
Hi. Since DiffSinger (PopCS) needs ground-truth f0 information at inference, is it possible to synthesize an unseen song (with phoneme labels, phoneme durations and notes provided) using the DiffSinger (PopCS) model?
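For readers following the question, here is a minimal sketch of what "notes plus durations" can give you in place of ground-truth f0: a stepwise, frame-level f0 contour computed with the standard MIDI-to-frequency formula (A4 = MIDI note 69 = 440 Hz). Everything else here is an assumption for illustration: the function names, the use of note 0 to mark a rest, and the `sample_rate` and `hop_size` defaults, which would have to match the model's actual config. Whether DiffSinger (PopCS) produces acceptable audio from such a flat contour is not established in this thread.

```python
def midi_to_hz(note):
    """Convert a MIDI note number to frequency in Hz (A4 = 69 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12.0)


def notes_to_f0(notes, durations_sec, sample_rate=24000, hop_size=128):
    """Build a stepwise frame-level f0 contour from notes and durations.

    Assumptions: note 0 marks a rest (emitted as f0 = 0), and sample_rate
    and hop_size are placeholders that must match the model's config.
    """
    frames_per_sec = sample_rate / hop_size
    f0 = []
    elapsed = 0.0
    for note, duration in zip(notes, durations_sec):
        elapsed += duration
        # Round on the cumulative time so rounding errors do not accumulate.
        n_frames = round(elapsed * frames_per_sec) - len(f0)
        value = midi_to_hz(note) if note > 0 else 0.0
        f0.extend([value] * n_frames)
    return f0


# Hypothetical three-event phrase: A4, a rest, then A3.
f0 = notes_to_f0([69, 0, 57], [0.5, 0.25, 0.5], sample_rate=24000, hop_size=120)
```

A contour built this way has no vibrato or pitch glides between notes, so it carries no information from any particular singer, at the cost of sounding flatter than an f0 curve extracted from a real performance.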