NVIDIA / mellotron

Mellotron: a multispeaker voice synthesis model based on Tacotron 2 GST that can make a voice emote and sing without emotive or singing training data
BSD 3-Clause "New" or "Revised" License

Can I transfer only rhythm or pitch between two audio files? #89

Open niu0717 opened 3 years ago

niu0717 commented 3 years ago

For example, suppose the source audio is modeled as P( mel(i) | T(s), S(s), P(s), R(s), Zmel(s) ; θ) and the target audio as P( mel(i) | T(t), S(t), P(t), R(t), Zmel(t) ; θ). How can I use Mellotron to replace only S or P of the source with the target's? In the end I want to generate audio from either P( mel(i) | T(s), S(t), P(s), R(s), Zmel(s) ; θ) or P( mel(i) | T(s), S(t), P(t), R(s), Zmel(s) ; θ). Thanks!
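To make the requested substitution concrete, here is a minimal sketch of the bookkeeping only. It does not call Mellotron: `Conditioning` and `swap_factors` are hypothetical names invented for illustration, and the string values stand in for whatever tensors the model would actually consume. It assumes T, S, P, R and Zmel can each be taken independently from either utterance, which may not hold in practice if the pitch contour and the rhythm come from recordings of different lengths.

```python
from dataclasses import dataclass, replace
from typing import Any, Iterable


@dataclass(frozen=True)
class Conditioning:
    """Hypothetical container for the variables in P(mel | T, S, P, R, Zmel; θ)."""
    text: Any     # T
    speaker: Any  # S
    pitch: Any    # P
    rhythm: Any   # R
    style: Any    # Zmel


ALLOWED = {"text", "speaker", "pitch", "rhythm", "style"}


def swap_factors(source: Conditioning, target: Conditioning,
                 factors: Iterable[str]) -> Conditioning:
    """Return a copy of `source` whose named factors are taken from `target`."""
    factors = list(factors)
    unknown = set(factors) - ALLOWED
    if unknown:
        raise ValueError(f"unknown factors: {sorted(unknown)}")
    return replace(source, **{name: getattr(target, name) for name in factors})


# Placeholder values; real inputs would be tensors or IDs (assumption).
src = Conditioning("T(s)", "S(s)", "P(s)", "R(s)", "Zmel(s)")
tgt = Conditioning("T(t)", "S(t)", "P(t)", "R(t)", "Zmel(t)")

# P( mel | T(s), S(t), P(s), R(s), Zmel(s) ; θ)
speaker_only = swap_factors(src, tgt, ["speaker"])

# P( mel | T(s), S(t), P(t), R(s), Zmel(s) ; θ)
speaker_and_pitch = swap_factors(src, tgt, ["speaker", "pitch"])
```

Each resulting object lists which utterance every conditioning variable comes from, matching the two cases in the question.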