openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism

Apache License 2.0

2.62k stars 275 forks

Better normalization of variance parameters #152

Open yqzhishen opened 8 months ago

yqzhishen commented 8 months ago

Motivation

The absolute values of some variance parameters are too large. Normalizing them to a similar scale may improve performance and benefit fine-tuning.
Linear normalization sometimes forces a trade-off between precision and range. For example, delta_pitch needs more precision around 0, but in some situations it also needs a wider range than the current default of ±8 keys.
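
To make the precision/range trade-off concrete, here is a minimal Python sketch of a clipped linear normalization. The function name `linear_normalize` and the `half_range` argument are hypothetical and used only for illustration; this is not DiffSinger's actual code, and the ±24 range is just an example of a "wider" setting.

```python
def linear_normalize(x, half_range):
    """Map x from [-half_range, half_range] to [-1, 1], clipping values outside it.

    Hypothetical helper for illustration only.
    """
    y = x / half_range
    return max(-1.0, min(1.0, y))


# Narrow range (±8): small deviations keep a usable magnitude,
# but anything beyond ±8 saturates at the boundary.
print(linear_normalize(0.5, 8.0))   # 0.0625
print(linear_normalize(12.0, 8.0))  # 1.0 (clipped)

# Wide range (±24): 12 is now representable,
# but the same 0.5 deviation is squeezed three times closer to 0.
print(linear_normalize(0.5, 24.0))
print(linear_normalize(12.0, 24.0))  # 0.5
```

With a single linear scale, widening the range and keeping resolution near 0 pull in opposite directions, which is the motivation for considering non-linear mappings.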

TODO

[ ] New option to normalize variance parameters to (-1, 1) before they are embedded into the model
[ ] Multiple types of normalization: linear, tanh, etc.
[ ] Generalize configuration schemas of all parameters
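
As a rough sketch of what the first two TODO items could look like, the snippet below maps a parameter value toward (-1, 1) with either a linear or a tanh mapping. The names `normalize`, `denormalize`, `scale`, and `norm_type` are assumptions made for this example, not existing DiffSinger options, and the eventual implementation may differ.

```python
import math


def normalize(x, scale, norm_type="linear"):
    """Normalize x toward (-1, 1). `scale` and `norm_type` are hypothetical options."""
    if norm_type == "linear":
        # Linear: uniform resolution, hard clipping outside [-scale, scale].
        return max(-1.0, min(1.0, x / scale))
    if norm_type == "tanh":
        # tanh: slope 1/scale near 0 (like linear), then smooth compression
        # of large values instead of hard clipping.
        # Note: for very large |x / scale| the float result rounds to exactly ±1.0.
        return math.tanh(x / scale)
    raise ValueError(f"unknown norm_type: {norm_type!r}")


def denormalize(y, scale, norm_type="linear"):
    """Inverse mapping; for tanh, y must lie strictly inside (-1, 1)."""
    if norm_type == "linear":
        return y * scale
    if norm_type == "tanh":
        return scale * math.atanh(y)
    raise ValueError(f"unknown norm_type: {norm_type!r}")


# A value of 24 with scale 8: linear clips it, tanh keeps it below 1.
print(normalize(24.0, 8.0, "linear"))  # 1.0
print(normalize(24.0, 8.0, "tanh"))

# The tanh mapping stays invertible for values that linear would clip.
y = normalize(24.0, 8.0, "tanh")
print(denormalize(y, 8.0, "tanh"))
```

The design point is that tanh behaves like the linear mapping near 0, where delta_pitch needs precision, while still distinguishing values beyond the nominal range rather than collapsing them onto the boundary.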