openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility, based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

run inference on my own checkpoint #130

Closed · nestyme closed this issue 11 months ago

nestyme commented 11 months ago

Hello! Thank you for this wonderful project! I've trained my own acoustic model on an English singing dataset and am now trying to figure out how to run inference on my custom data. I see that I need to create a .ds file, but I don't understand how to do it correctly. Could you please provide some guidance here? Thank you :)

yqzhishen commented 11 months ago

OpenUtau for DiffSinger is the recommended way to test your models with DS files. You can export your model to ONNX, package it as an OpenUtau voicebank, edit music scores, lyrics and parameters in the editor, and export *.ds files. There are currently several example DS files in the samples/ folder, but those only use the Chinese dictionary.
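
For reference, here is a minimal sketch of what a DS file contains, loosely modeled on the files under samples/. The exact key names and which keys are required depend on your repository version, and the phoneme names, note values, and inference command below are assumptions for illustration, so check a real sample file and the scripts/ folder in your checkout before relying on this:

```python
import json

# A hypothetical single-segment *.ds file (a DS file is a JSON list of segments).
# Key names follow the examples under samples/ but should be verified there.
segment = {
    "offset": 0.0,                 # segment start time in seconds
    "text": "SP la SP",            # word-level lyrics ("SP" = silence)
    "ph_seq": "SP l aa SP",        # phoneme sequence from your dictionary
    "ph_dur": "0.3 0.1 0.8 0.3",   # per-phoneme durations in seconds
    "ph_num": "1 2 1",             # phonemes per word/note group
    "note_seq": "rest C4 rest",    # one note name per group
    "note_dur": "0.3 0.9 0.3",     # per-note durations in seconds
    "note_slur": "0 0 0",          # slur flag per note
}

with open("my_song.ds", "w", encoding="utf-8") as f:
    json.dump([segment], f, ensure_ascii=False, indent=2)

# Inference could then be run with the repository's CLI, for example
# (script name and flags are assumptions -- confirm against scripts/):
#   python scripts/infer.py acoustic my_song.ds --exp <your_experiment_name>
```

In practice it is much easier to let OpenUtau for DiffSinger generate these files for you, since it handles phoneme alignment and timing from the piano-roll score.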

nestyme commented 11 months ago

@yqzhishen thank you!

blizzard090 commented 3 months ago

Hello @yqzhishen, thank you for your work! I have been training an acoustic model on Vietnamese data following the MakeDiffSinger guide, and I'm now stuck on how to use the model at the inference step. First, the main question: given (1) a melody audio file and (2) lyrics as input, can I use DiffSinger to generate a vocal audio track? Second, regarding the instruction above ("You can export your model to ONNX, package it as an OpenUtau voicebank"): I have already exported to ONNX format, but OpenUtau for DiffSinger does not accept the ONNX file by itself. Could you explain in more detail how to build a voicebank from the ONNX model? I look forward to your response!

yqzhishen commented 3 months ago

@blizzard090 OpenUTAU for DiffSinger has wiki pages for voicebank developers here: https://github.com/xunmengshe/OpenUtau/wiki
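
For what it's worth, here is a rough sketch of how a voicebank folder might be assembled around the exported ONNX model. The folder layout, file names (acoustic.onnx, phonemes.txt) and the dsconfig.yaml keys are assumptions based on typical DiffSinger voicebank layouts, so verify everything against the wiki pages linked above before using it:

```python
from pathlib import Path
import shutil

# Hypothetical paths -- adjust to your own export output and dictionary.
onnx_model = Path("artifacts/my_exp/acoustic.onnx")
phoneme_list = Path("artifacts/my_exp/phonemes.txt")
voicebank = Path("MyVoicebank")
voicebank.mkdir(parents=True, exist_ok=True)

# Copy the exported model and its phoneme list into the voicebank folder.
shutil.copy(onnx_model, voicebank / "acoustic.onnx")
shutil.copy(phoneme_list, voicebank / "phonemes.txt")

# dsconfig.yaml tells OpenUtau where the model files live. The key names
# below are an assumption modeled on existing voicebanks -- confirm them
# against the wiki before distributing the voicebank.
(voicebank / "dsconfig.yaml").write_text(
    "phonemes: phonemes.txt\n"
    "acoustic: acoustic.onnx\n"
    "vocoder: nsf_hifigan\n",
    encoding="utf-8",
)
```

OpenUtau loads the folder as a singer once the configuration files are in place; the wiki describes the remaining metadata (character files, dictionary, vocoder) that a complete voicebank needs.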