1. |
Learn2Sing: Target Speaker Singing Voice Synthesis by learning from a Singing Teacher |
TTS -> SVS. 1) Models f0 and duration. 2) Is modeling these two with a GMM better than with MSE? Compare how WaveNet can fit the waveform with a GMM. 3) Domain adaptation helps with long (sustained) notes.
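To make the GMM-versus-MSE question in item 1 concrete, here is a minimal NumPy sketch; it is not the paper's method. It compares a mean-seeking point prediction with the negative log-likelihood of a one-dimensional Gaussian mixture on a toy bimodal f0 target. The function name `gmm_nll` and all numbers are assumptions for illustration.

```python
import numpy as np

def gmm_nll(y, weights, means, stds):
    """Negative log-likelihood of scalar y under a 1-D Gaussian mixture."""
    weights, means, stds = map(np.asarray, (weights, means, stds))
    log_comp = (
        np.log(weights)
        - 0.5 * np.log(2.0 * np.pi)
        - np.log(stds)
        - 0.5 * ((y - means) / stds) ** 2
    )
    # log-sum-exp over components for numerical stability
    m = log_comp.max()
    return -(m + np.log(np.exp(log_comp - m).sum()))

# Toy case (assumed): the same context is sung at 220 Hz half the time, 440 Hz otherwise.
targets = np.array([220.0, 440.0])

# The MSE-optimal point prediction is the mean, which matches neither mode.
mse_prediction = targets.mean()

# A two-component mixture can place one component on each mode.
bimodal = dict(weights=[0.5, 0.5], means=[220.0, 440.0], stds=[10.0, 10.0])
# A single Gaussian centred on the mean needs a wide std to cover both modes.
unimodal = dict(weights=[1.0], means=[330.0], stds=[110.0])

nll_bimodal = np.mean([gmm_nll(y, **bimodal) for y in targets])
nll_unimodal = np.mean([gmm_nll(y, **unimodal) for y in targets])
```

In this toy setting the mixture assigns higher likelihood to both observed pitches than the single Gaussian does, which is the usual argument for a mixture output when the target is multimodal.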
2. |
Speech-to-Singing Conversion in an Encoder-Decoder Framework |
code; preserves some of its characteristics (e.g., speaker identity, linguistic content), while modifying certain others (melody, phoneme durations). 1) Which vocoder for multi-speaker? Griffin-Lim. 2) How are speech and singing aligned? By directly time-stretching the speech, regardless of whether the two actually end up aligned. 3) Silent frame removal module.
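Points 2) and 3) of item 2 can be sketched roughly as follows. This is an assumption-laden illustration, not the paper's code: silence is detected by a relative frame-energy threshold (the paper's module may work differently), frames are assumed to be linear-magnitude spectrogram frames, and the stretch is plain linear interpolation along time. The names `remove_silent_frames`, `stretch_to_length`, and `threshold_db` are hypothetical.

```python
import numpy as np

def remove_silent_frames(frames, threshold_db=-40.0):
    """Keep frames whose energy is within |threshold_db| of the loudest frame."""
    energy = np.sum(frames ** 2, axis=1) + 1e-10
    energy_db = 10.0 * np.log10(energy / energy.max())
    return frames[energy_db > threshold_db]

def stretch_to_length(frames, target_len):
    """Linearly interpolate a (T, D) feature sequence to target_len frames."""
    src_len, dim = frames.shape
    src_pos = np.linspace(0.0, src_len - 1, num=target_len)
    idx = np.arange(src_len)
    return np.stack(
        [np.interp(src_pos, idx, frames[:, d]) for d in range(dim)], axis=1
    )

# Toy speech features: 6 frames, 4 bins, with two near-silent frames in the middle.
speech = np.ones((6, 4))
speech[2:4] = 1e-4

voiced = remove_silent_frames(speech)       # silent frames dropped
stretched = stretch_to_length(voiced, 10)   # stretched to the singing length
```

Nothing in this procedure checks phoneme boundaries, which matches the note that the stretch is applied whether or not the two signals actually line up.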
3. |
Unsupervised Singing Voice Conversion |
|
4. |
WGANSing: A Multi-Voice Singing Voice Synthesizer Based on the Wasserstein-GAN |
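code. 1) Input and output have the same length, so how is the input expanded to that length in the first place? It uses frame-wise phoneme annotations, as in NPSS; it turns out WGANSing treats duration and f0 as known conditions and expands the input up front.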
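The up-front expansion described in item 4 amounts to repeating each phoneme label for its duration in frames and attaching the known f0 per frame. A minimal sketch, assuming integer phoneme ids, durations already given in frames, and a hypothetical inventory size `num_phonemes`; the real WGANSing feature layout may differ.

```python
import numpy as np

def expand_to_frames(phoneme_ids, durations_in_frames):
    """Repeat each phoneme id for its duration to get one label per frame."""
    return np.repeat(np.asarray(phoneme_ids), np.asarray(durations_in_frames))

num_phonemes = 8                      # assumed inventory size
phoneme_ids = [3, 7, 1]               # phoneme sequence
durations = [2, 4, 3]                 # known durations, in frames

frame_ids = expand_to_frames(phoneme_ids, durations)

# Known f0 contour, one value per frame (toy values in Hz).
f0 = np.linspace(200.0, 240.0, num=len(frame_ids))

# Frame-wise conditioning: one-hot phoneme label plus f0.
one_hot = np.eye(num_phonemes)[frame_ids]
conditioning = np.concatenate([one_hot, f0[:, None]], axis=1)
```

Because the conditioning already has one row per output frame, the generator never has to predict durations itself.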
5. |
A Combination of Model-based and Feature-based Strategy for Speech-to-Singing Alignment |
alignment approaches |
6. |
A Dual Alignment Scheme for Improved Speech-to-Singing Voice Conversion |
alignment approaches
7. |
A Universal Music Translation Network |
code. 1) The pitch of the input audio clip was changed locally.
8. |
Speech-to-singing synthesis: Converting speaking voices to singing voices by controlling acoustic features unique to singing voices |
Model-based STS; non-neural-network model
9. |
Learning Singing From Speech |
Male/female conversion; average f0
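One common way to use average f0 for male/female conversion is to shift the mean of log-f0 on voiced frames. The sketch below shows that baseline; it is an assumption about what the note refers to and may not be what the paper does. The function name `shift_mean_log_f0` is hypothetical, and unvoiced frames are assumed to be marked with f0 = 0.

```python
import numpy as np

def shift_mean_log_f0(f0, target_mean_log_f0):
    """Move the mean log-f0 of voiced frames to target_mean_log_f0."""
    f0 = np.asarray(f0, dtype=float)
    voiced = f0 > 0
    out = np.zeros_like(f0)            # unvoiced frames stay at 0
    if voiced.any():
        log_f0 = np.log(f0[voiced])
        out[voiced] = np.exp(log_f0 - log_f0.mean() + target_mean_log_f0)
    return out

# Toy male contour in Hz, with two unvoiced frames.
male_f0 = [0.0, 110.0, 120.0, 0.0, 130.0]

# Assumed target: a female speaker whose mean log-f0 corresponds to 220 Hz.
converted = shift_mean_log_f0(male_f0, np.log(220.0))
```

Shifting in the log domain keeps the melodic intervals of the contour intact while moving its register.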
10. |
I2R Speech2Singing Perfects Everyone’s Singing |
Rhythm correction by DTW |
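For reference, a minimal dynamic time warping (DTW) sketch of the kind that rhythm correction relies on. It aligns two one-dimensional sequences with an absolute-difference cost; the actual features and step constraints used by the I2R system are not stated in these notes, so both are assumptions here.

```python
import numpy as np

def dtw_path(x, y):
    """Return the accumulated cost and warping path between 1-D sequences."""
    n, m = len(x), len(y)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(x[i - 1] - y[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1], cost[i - 1, j - 1])
    # Backtrack from the end to recover the frame-to-frame alignment.
    i, j = n, m
    path = [(i - 1, j - 1)]
    while (i, j) != (1, 1):
        steps = [(i - 1, j - 1), (i - 1, j), (i, j - 1)]
        i, j = min(steps, key=lambda s: cost[s])
        path.append((i - 1, j - 1))
    return cost[n, m], path[::-1]

# Toy example: the reference holds the second value one frame longer.
user = [1.0, 2.0, 3.0]
reference = [1.0, 2.0, 2.0, 3.0]
total_cost, path = dtw_path(user, reference)
```

Each pair in `path` maps a frame of the user's input to a frame of the reference, so the user's frames can be repeated or dropped to follow the reference rhythm.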
11. |
DATA EFFICIENT VOICE CLONING FOR NEURAL SINGING SYNTHESIS |
Learned speaker embedding; strategy for new speakers; fine-tuning strategy
12. |
HMM-based singing voice synthesis system using pitch-shifted pseudo training data |
singing alignment, Pitch-shifted Pseudo Training |
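The pitch-shifted pseudo training idea in item 12 can be illustrated on f0 and note labels alone: shifting by s semitones multiplies f0 by 2^(s/12) and adds s to each note number. This sketch is a simplification under stated assumptions; it leaves spectral features untouched, and the helper names and the default shifts of one semitone down and up are hypothetical.

```python
import numpy as np

def pitch_shift_f0(f0_hz, semitones):
    """Scale voiced f0 values by the given number of semitones; keep 0 as unvoiced."""
    f0_hz = np.asarray(f0_hz, dtype=float)
    return np.where(f0_hz > 0, f0_hz * 2.0 ** (semitones / 12.0), 0.0)

def make_pseudo_data(f0_hz, note_numbers, shifts=(-1, 1)):
    """Return one (shifted_f0, shifted_notes) pair per semitone shift."""
    notes = np.asarray(note_numbers)
    return [(pitch_shift_f0(f0_hz, s), notes + s) for s in shifts]

# Toy data: one voiced frame at A3 (220 Hz, MIDI note 57) and one unvoiced frame.
f0 = [220.0, 0.0]
notes = [57, 57]

pseudo = make_pseudo_data(f0, notes)
octave_up = pitch_shift_f0(f0, 12)
```

Each shifted copy gives the model training examples for pitches that the original recordings do not cover.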
13. |
A Strategy for Improved Phone-Level Lyrics-to-Audio Alignment for Speech-to-Singing Synthesis; Applying Spectral Normalisation and Efficient Envelope Estimation and Statistical Transformation for the Voice Conversion Challenge 2016 |
Post-processing moves the pitch scale from singing to speech, so could we apply the inverse transform? Phase correction model: could it help the vocoder?