openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

Error failed to render #140

Closed: gnloop closed this issue 10 months ago

gnloop commented 10 months ago

I made a multi-singer model with the 2.0 release, but when I try it I get this error:

[Screenshot of the error message: 1000016595]

Exporting the model to ONNX was also very fast. Did I do something wrong while training? TensorBoard shows training progressing normally.

My old multi-singer model, trained on the old refactor branch, works perfectly fine.

yqzhishen commented 10 months ago

This seems to be a voicebank packaging issue, which does not belong in this repository. The only changes to the ONNX exporter API in v2.0.0 are the default behavior of exporting speaker embeddings and the ONNX operator set version. Please check that you are using the latest version of OpenUtau and that the configuration in the voicebank is written correctly. You can also use Netron to visualize the ONNX model and see whether the inputs and outputs are the same as in the former versions.
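For readers who prefer a script over Netron, the input comparison suggested here can be sketched in Python. This is a minimal sketch, not part of DiffSinger or OpenUtau: the helper names `list_onnx_inputs` and `diff_inputs` are hypothetical, the input names in the example are made up for illustration, and reading a real model assumes the third-party `onnx` package is installed.

```python
from typing import Dict, Iterable, List


def list_onnx_inputs(path: str) -> List[str]:
    """Return the names of the graph inputs that a caller must feed."""
    # Third-party dependency, imported lazily so diff_inputs works without it.
    import onnx

    model = onnx.load(path)
    # Some exporters also list weights among the graph inputs; drop those.
    initializers = {init.name for init in model.graph.initializer}
    return [i.name for i in model.graph.input if i.name not in initializers]


def diff_inputs(old: Iterable[str], new: Iterable[str]) -> Dict[str, List[str]]:
    """Report which input names were added or removed between two models."""
    old_set, new_set = set(old), set(new)
    return {
        "added": sorted(new_set - old_set),
        "removed": sorted(old_set - new_set),
    }


# Illustrative input names only; real models may use different ones.
old_inputs = ["tokens", "durations", "f0"]
new_inputs = ["tokens", "durations", "f0", "gender", "breathiness"]
result = diff_inputs(old_inputs, new_inputs)
print(result)
```

With real files, `diff_inputs(list_onnx_inputs(old_path), list_onnx_inputs(new_path))` would do the same comparison. Any name under "added" is an input the host application has to supply before the new model can render.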

gnloop commented 10 months ago

It shows up correctly in Netron, but I doubt it's an OpenUtau issue because the ONNX export (on Colab) finished in less than a minute. There were no major errors, only a constant folding user warning. I tried everything and have double-checked the config. OpenUtau is indeed up to date. I have no idea what else to try.

yqzhishen commented 10 months ago

I checked my own ONNX model and found that the node /fs2/Unsqueeze_3 is related to the gender input. The error message seems confusing, but have you checked your configuration related to the GENC input in your dsconfig.yaml? Also, will it show more useful messages if you switch to CPU rendering instead of using DirectML?

gnloop commented 10 months ago

Using CPU actually gives a new error that mentions a missing breathiness input. Could that be because I set use_breathiness_embed: true but haven't trained a variance model yet?

There's no mention of gender or breathiness in my dsconfig.yaml, just phonemes, acoustic, vocoder and speakers.

yqzhishen commented 10 months ago

Ah, that explains everything - OpenUTAU does not support this parameter yet :(
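The mismatch diagnosed here (a model trained with an embedding that the host cannot feed) could be caught before export with a small check. The following is a rough sketch under stated assumptions: only `use_breathiness_embed` comes from this thread, the `use_<name>_embed` pattern is assumed to generalize to other flags, `use_example_embed` is a placeholder, and the set of host-supported parameters is hypothetical rather than OpenUtau's actual list.

```python
import re
from typing import Iterable, List, Set

# Assumed naming convention for embedding flags in the training config.
EMBED_FLAG = re.compile(r"^use_(\w+)_embed$")


def enabled_embeds(train_config: dict) -> Set[str]:
    """Collect parameter names from `use_<name>_embed` flags set to True."""
    names = set()
    for key, value in train_config.items():
        match = EMBED_FLAG.match(key)
        if match and value is True:
            names.add(match.group(1))
    return names


def unsupported_params(train_config: dict, host_supported: Iterable[str]) -> List[str]:
    """List enabled embeddings that the host application cannot supply."""
    return sorted(enabled_embeds(train_config) - set(host_supported))


# Hypothetical values for illustration only.
train_config = {"use_breathiness_embed": True, "use_example_embed": False}
host_supported = {"pitch"}
missing = unsupported_params(train_config, host_supported)
print(missing)
```

A non-empty result means the exported model will expect an input the host does not provide, which matches the missing breathiness input reported under CPU rendering.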

gnloop commented 10 months ago

Is it possible to disable it without having to retrain everything?

yqzhishen commented 10 months ago

I'm sorry, but breathiness is a variance parameter and has no default value. You will have to wait until OpenUTAU supports it.

gnloop commented 10 months ago

I see, thank you very much for the help