openvpi / DiffSingerMiniEngine

A minimum inference engine for DiffSinger
GNU Affero General Public License v3.0

Please update this so it works for latest generation diffsinger models that have linguistic.onnx models #2

Open yakotoka opened 1 year ago

yakotoka commented 1 year ago

It looks like newer-generation DiffSinger models now include a linguistic model (linguistic.onnx) that takes tokens, word divisions, and word durations as input. Its outputs are encoder_out and x_masks, which are then fed to the duration.onnx model.

Example below (please tell me whether the zeroes are needed):

results = linguistic_model.run(None, {"tokens": [[26, 1, 22, 35, 11]], "word_div": [[3, 2, 0, 0, 0]], "word_dur": [[48, 24, 0, 0, 0]]})
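For reference, here is a minimal sketch of how the three inputs could be assembled and sanity-checked before calling the model. It rests on an assumption that is not confirmed anywhere in this thread: that word_div holds one phoneme count per word and word_dur one frame count per word, so neither is padded to the token length and sum(word_div) equals the number of tokens. The helper name build_linguistic_feed is hypothetical.

```python
def build_linguistic_feed(tokens, word_div, word_dur):
    """Build a batch-of-one feed dict for a linguistic.onnx-style model.

    Assumption (not confirmed in this thread): word_div[i] is the number
    of tokens in word i and word_dur[i] is that word's length in frames,
    so the two lists have one entry per word and are not zero-padded.
    """
    if len(word_div) != len(word_dur):
        raise ValueError("word_div and word_dur need one entry per word")
    if sum(word_div) != len(tokens):
        raise ValueError("sum(word_div) must equal the number of tokens")
    # Add the leading batch dimension, matching the [[...]] shape above.
    return {
        "tokens": [list(tokens)],
        "word_div": [list(word_div)],
        "word_dur": [list(word_dur)],
    }


# Same values as the example above, with the trailing zeroes dropped.
feed = build_linguistic_feed([26, 1, 22, 35, 11], [3, 2], [48, 24])
print(feed)
```

To run this against a real model, each list would likely need converting to an integer NumPy array first (int64 is an assumption here). Calling get_inputs() on the onnxruntime InferenceSession lists the input names, types, and shapes that a given linguistic.onnx actually expects, which is the quickest way to check whether the padded or unpadded form is correct.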

Happy to get your thoughts, thank you!

yqzhishen commented 1 year ago

This project is deprecated now. You can use OpenUTAU for DiffSinger to synthesize with ONNX models. In any case, this is only a simple demo project, and you can easily extend it or even rewrite it.