openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

Error failed to render #140

Closed: gnloop closed this issue 1 year ago

gnloop commented 1 year ago

I made a multi-singer model with the 2.0 release, but when I try it I get this error:

[Attached image: 1000016595]

Exporting the model to ONNX was also very fast. Did I do something wrong while training? TensorBoard shows training progressing normally.

My old multi-singer model, trained on the old refactor branch, works perfectly fine.

yqzhishen commented 1 year ago

This seems to be a voicebank packaging issue, which does not belong in this repository. The only changes to the ONNX exporter API in v2.0.0 are the default behavior of exporting speaker embeddings and the ONNX operator set version. Please check that you are using the latest version of OpenUtau and that you have written the configuration in the voicebank correctly. You can also use Netron to visualize the ONNX model and see whether the inputs and outputs are the same as in the former versions.
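The input comparison suggested above can also be done in a script rather than by eye in Netron. The following is a minimal sketch, not part of the DiffSinger or OpenUtau tooling: it assumes the `onnx` Python package is installed, and the function names `list_graph_inputs` and `compare_inputs` are made up for illustration.

```python
from typing import Iterable, List, Set, Tuple


def list_graph_inputs(model_path: str) -> List[str]:
    """Return the names of the real (non-initializer) inputs of an ONNX model."""
    # Imported lazily so compare_inputs() can be used without onnx installed.
    import onnx

    model = onnx.load(model_path)
    # Older exports may list weights among the graph inputs; filter them out.
    initializer_names = {init.name for init in model.graph.initializer}
    return [i.name for i in model.graph.input if i.name not in initializer_names]


def compare_inputs(old: Iterable[str], new: Iterable[str]) -> Tuple[Set[str], Set[str]]:
    """Return (added, removed) input names going from the old export to the new one."""
    old_set, new_set = set(old), set(new)
    return new_set - old_set, old_set - new_set
```

Calling `compare_inputs(list_graph_inputs(old_path), list_graph_inputs(new_path))` on two exported models returns the inputs that exist only in the new export and those that exist only in the old one. Any name in the first set is an input the host application must be able to supply.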

gnloop commented 1 year ago

It shows up correctly in Netron, but I doubt it's an OpenUtau issue, because the ONNX export (on Colab) took less than a minute. There were no major errors except a constant folding user warning. I have tried everything and double-checked the config. OpenUtau is indeed up to date. I have no idea what else to try.

yqzhishen commented 1 year ago

I checked my own ONNX model and found that the node /fs2/Unsqueeze_3 is related to the gender input. The error message seems confusing, but have you checked your configuration related to the GENC input in your dsconfig.yaml? Also, will it show more useful messages if you switch to CPU rendering instead of using DirectML?

gnloop commented 1 year ago

Using CPU actually gives a new error that mentions a missing breathiness input. Could that be because I set use_breathiness_embed: true but haven't trained variance yet?

There's no mention of either gender or breathiness in my dsconfig.yaml, just phonemes, acoustic, vocoder and speakers.

yqzhishen commented 1 year ago

Ah, that explains everything - OpenUTAU doesn't support this parameter yet :(
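For readers hitting the same problem, a quick check of the training config before exporting can show which extra inputs the model is likely to require. This is only a sketch under stated assumptions: it assumes the `use_<name>_embed` naming pattern seen in `use_breathiness_embed` applies to other flags too, and the key `use_example_embed` and the function name are hypothetical.

```python
from typing import Any, Dict, List


def enabled_embed_parameters(config: Dict[str, Any]) -> List[str]:
    """List parameter names whose `use_<name>_embed` flag is set to true."""
    prefix, suffix = "use_", "_embed"
    names = []
    for key, value in config.items():
        has_name = len(key) > len(prefix) + len(suffix)
        if has_name and key.startswith(prefix) and key.endswith(suffix) and value is True:
            # Strip the prefix and suffix to recover the parameter name.
            names.append(key[len(prefix):-len(suffix)])
    return sorted(names)


# Example fragment; only `use_breathiness_embed` comes from this thread,
# `use_example_embed` is a hypothetical key used for illustration.
training_config = {"use_breathiness_embed": True, "use_example_embed": False}
print(enabled_embed_parameters(training_config))  # prints ['breathiness']
```

Each name returned is a parameter the rendering application would probably need to feed to the model, so an unsupported one is worth catching before a long training run.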

gnloop commented 1 year ago

Is it possible to disable it without having to retrain everything?

yqzhishen commented 1 year ago

I'm sorry, but breathiness is a variance parameter and has no default value. You will have to wait until OpenUTAU supports it.

gnloop commented 1 year ago

I see, thank you very much for the help.