Melody encoder and ornaments modeling

openvpi / DiffSinger

An advanced singing voice synthesis system with high fidelity, expressiveness, controllability and flexibility based on DiffSinger: Singing Voice Synthesis via Shallow Diffusion Mechanism

Apache License 2.0


Melody encoder and ornaments modeling #142

Closed (yqzhishen closed this 12 months ago)

yqzhishen commented 1 year ago

The melody encoder computes attention directly over the note sequence, in addition to the linguistic features. With this new method of melody modeling, the pitch predictor becomes more sensitive to pitch trends in the music score, which improves accuracy and stability on short slurs, long vibratos and out-of-range notes. In addition, this note-level encoder can also accept ornament tags, such as glides, as input.
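For readers who want a concrete picture, here is a minimal sketch of the idea, not the actual DiffSinger code. It assumes one possible reading of the description: notes are turned into feature vectors that include a glide tag, self-attention runs across the note sequence, and the result is expanded to frame level so it can be combined with frame-level linguistic features. The names `Note`, `GLIDE_TAGS`, `embed_note`, `self_attention` and `expand_to_frames`, the tag set, and the frame length are all hypothetical. A real model would use learned embeddings and projections rather than these hand-built features.

```python
import math
from dataclasses import dataclass

# Hypothetical ornament vocabulary; the real tag set is not given in this thread.
GLIDE_TAGS = {"none": 0, "up": 1, "down": 2}


@dataclass
class Note:
    midi: float          # MIDI pitch of the note
    dur: float           # duration in seconds
    glide: str = "none"  # ornament tag attached to the note


def embed_note(note):
    """Toy note feature: [scaled pitch, duration, one-hot glide tag]."""
    one_hot = [0.0] * len(GLIDE_TAGS)
    one_hot[GLIDE_TAGS[note.glide]] = 1.0
    return [note.midi / 127.0, note.dur] + one_hot


def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(feats):
    """Scaled dot-product self-attention without learned projections:
    every note attends over the whole note sequence."""
    d = len(feats[0])
    out = []
    for q in feats:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in feats]
        weights = softmax(scores)
        out.append([sum(w * v[j] for w, v in zip(weights, feats)) for j in range(d)])
    return out


def expand_to_frames(note_vectors, notes, frame_sec=0.25):
    """Repeat each note-level vector for the number of frames the note spans."""
    frames = []
    for vec, note in zip(note_vectors, notes):
        frames.extend([vec] * max(1, round(note.dur / frame_sec)))
    return frames


notes = [Note(60, 0.5), Note(64, 0.25, "up"), Note(62, 1.0)]
note_feats = [embed_note(n) for n in notes]
note_ctx = self_attention(note_feats)              # note-level melody context
melody_frames = expand_to_frames(note_ctx, notes)  # frame-level melody context

# Stand-in for frame-level linguistic features (all zeros here); the two
# streams are fused by simple addition in this sketch.
ling_frames = [[0.0] * len(melody_frames[0]) for _ in melody_frames]
condition = [[l + m for l, m in zip(lf, mf)] for lf, mf in zip(ling_frames, melody_frames)]
```

Because each output vector mixes information from every note, a predictor fed with `condition` can see where the melody is heading rather than only the current note, which is the intuition behind the improved handling of pitch trends described above.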

yakotoka commented 1 year ago

Is this for the pitch model? Does pitch model training make use of this implementation, or is it applied after the pitch model has been created? Thank you!

yqzhishen commented 1 year ago

@yakotoka This is an improvement to the current pitch model, but models with and without the melody encoder will not be compatible with each other.

ghhcbef1 commented 11 months ago

Has OpenUtau implemented this feature yet?

yqzhishen commented 11 months ago

@ghhcbef1 Not yet. It may take a while.