openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
Apache License 2.0

Change variance retaking masks from zero inputs to zero embeddings #104

Closed yqzhishen closed 1 year ago

yqzhishen commented 1 year ago

This pull request makes four changes to the variance retaking mechanism:

  1. Zero masks are applied to the embeddings instead of the inputs.
  2. The retaking embedding for variance parameters is removed.
  3. The masks for each variance parameter are decoupled so that each parameter can be retaken separately.
  4. The generation algorithm for retaking masks is improved.
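
To illustrate items 1 and 3, here is a minimal sketch in plain Python, not the code from this pull request. It assumes each variance parameter is embedded per frame by a simple linear projection. The helper names (`linear_embed`, `mask_inputs_then_embed`, `embed_then_mask`), the parameter names `energy` and `breathiness`, and all numbers are hypothetical. One possible motivation, which the description above does not state, is that zeroing the input makes a masked frame indistinguishable from a frame whose true value is 0, while zeroing the embedding does not.

```python
def linear_embed(values, weight, bias):
    """Project each scalar frame value to a vector: value * weight + bias."""
    return [[v * w + b for w, b in zip(weight, bias)] for v in values]


def mask_inputs_then_embed(values, retake, weight, bias):
    """Sketch of the old behaviour: zero the inputs on retaken frames, then embed."""
    masked = [0.0 if r else v for v, r in zip(values, retake)]
    return linear_embed(masked, weight, bias)


def embed_then_mask(values, retake, weight, bias):
    """Sketch of the new behaviour: embed first, then zero the embeddings on retaken frames."""
    embedded = linear_embed(values, weight, bias)
    return [[0.0] * len(weight) if r else e for e, r in zip(embedded, retake)]


# Hypothetical projection parameters (assumed, 2-dimensional embedding).
weight, bias = [0.5, -1.0], [0.1, 0.2]

# Hypothetical per-frame curves; frame 0 of "energy" is genuinely 0.0.
curves = {
    "energy": [0.0, 2.0, 3.0],
    "breathiness": [1.0, 1.0, 1.0],
}
# Decoupled masks: each parameter has its own retake mask (True = retake this frame).
retake_masks = {
    "energy": [False, True, False],
    "breathiness": [False, False, True],
}

old_energy = mask_inputs_then_embed(curves["energy"], retake_masks["energy"], weight, bias)
new_energy = embed_then_mask(curves["energy"], retake_masks["energy"], weight, bias)

# Sum the per-parameter embeddings frame by frame to form a combined condition.
per_param = [embed_then_mask(curves[k], retake_masks[k], weight, bias) for k in curves]
condition = [[sum(dims) for dims in zip(*frames)] for frames in zip(*per_param)]
```

In this sketch, `old_energy[0]` and `old_energy[1]` are identical (both equal the bias), so a real zero and a masked frame cannot be told apart, whereas `new_energy[1]` is an all-zero vector. Because the masks are kept per parameter, frame 1 of `condition` carries only the `breathiness` embedding and frame 2 carries only the `energy` embedding.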

NOTE: These modifications are backward-incompatible. All models involving pitch and variance predictions need to be re-trained.