An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism
This PR adds support for tension, a new variance parameter.
Briefly, tension is related to the ratio of the base harmonic to the full harmonics. A low ratio (meaning the other harmonics are strong compared to the base harmonic) corresponds to high tension.
Tension is expected to be a better successor to energy: it controls the strength of the singing voice much more effectively and pushes expressiveness to a new level.
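To make the base-to-full ratio concrete, here is a minimal sketch on a synthetic signal. It is not the implementation in this PR: the function names (`synth`, `harmonic_amplitudes`, `base_to_full_ratio`), the use of an energy ratio, and the known-f0 projection are all assumptions for illustration, and the actual tension curve may apply further scaling or smoothing.

```python
import math

def synth(harmonic_amps, sample_rate=16000, f0=200.0, num_samples=1600):
    """Hypothetical helper: sum of sine harmonics with the given amplitudes."""
    return [
        sum(a * math.sin(2.0 * math.pi * (k + 1) * f0 * i / sample_rate)
            for k, a in enumerate(harmonic_amps))
        for i in range(num_samples)
    ]

def harmonic_amplitudes(signal, sample_rate, f0, num_harmonics):
    """Estimate the amplitude of each harmonic k*f0 by projecting the signal
    onto a cosine/sine pair at that frequency (a single-bin DFT)."""
    n = len(signal)
    amps = []
    for k in range(1, num_harmonics + 1):
        omega = 2.0 * math.pi * k * f0 / sample_rate
        re = sum(x * math.cos(omega * i) for i, x in enumerate(signal))
        im = sum(x * math.sin(omega * i) for i, x in enumerate(signal))
        amps.append(2.0 * math.hypot(re, im) / n)
    return amps

def base_to_full_ratio(amps):
    """Energy of the base harmonic divided by the energy of all harmonics
    (assumed definition, for illustration only)."""
    energies = [a * a for a in amps]
    total = sum(energies)
    return energies[0] / total if total > 0 else 0.0

# A "relaxed" voice with weak upper harmonics vs. a "tense" one with strong ones.
relaxed = synth([1.0, 0.3, 0.1])
tense = synth([1.0, 0.9, 0.8])

relaxed_ratio = base_to_full_ratio(harmonic_amplitudes(relaxed, 16000, 200.0, 3))
tense_ratio = base_to_full_ratio(harmonic_amplitudes(tense, 16000, 200.0, 3))

# The tense signal yields the lower ratio, i.e. higher tension.
print(relaxed_ratio, tense_ratio)
```

The synthetic signals span a whole number of periods, so the projection recovers each harmonic amplitude cleanly. Real recordings would need a harmonic decomposition step first.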
Other changes in this PR:
Introduce a new DecomposedWaveform class to extract the aperiodic part, the harmonic part and each level of harmonics from a waveform.
Remove hparams from all pitch extraction APIs to make them more reusable.
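As an illustration of why dropping a global config helps reuse, the sketch below shows a pitch estimator that takes every setting as an explicit argument. It is a hypothetical example, not one of the extractors touched by this PR: `estimate_f0_autocorr` and its parameters are invented here, and a plain autocorrelation peak search stands in for the real algorithms.

```python
import math

def estimate_f0_autocorr(waveform, sample_rate, f0_min=80.0, f0_max=500.0):
    """Hypothetical pitch estimator with no dependency on a global hparams dict.

    Searches for the lag with the largest (unnormalized) autocorrelation
    inside the lag range implied by f0_min and f0_max.
    """
    n = len(waveform)
    min_lag = max(1, int(sample_rate / f0_max))
    max_lag = min(n - 1, int(sample_rate / f0_min))
    best_lag, best_score = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        score = sum(waveform[i] * waveform[i + lag] for i in range(n - lag))
        if score > best_score:
            best_lag, best_score = lag, score
    return sample_rate / best_lag

# Callers pass settings directly, so the function works outside any
# training pipeline or configuration system.
sample_rate = 8000
tone = [math.sin(2.0 * math.pi * 200.0 * i / sample_rate) for i in range(800)]
f0 = estimate_f0_autocorr(tone, sample_rate, f0_min=80.0, f0_max=500.0)
print(f0)
```

Because nothing is read from shared state, the same function can be called from preprocessing scripts, notebooks or tests with different settings side by side.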