Closed — lmaxwell closed this issue 5 years ago
Sorry, I did not notice that the z-encoder is not used in the violin modelling.
If singing voice is trained together with lots of instruments (e.g. the NSynth dataset), interpolating the z-vector would result in cross-synthesis of instrument and singing voice. I'm interested in what that sounds like. Do you think it is feasible?
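For reference, the interpolation I have in mind is just a linear blend between two latent codes before decoding. A minimal sketch (the variable names and the 16-dim latent size are only illustrative assumptions, not the model's actual configuration):

```python
import numpy as np

def interpolate_z(z_a, z_b, alpha):
    """Linearly interpolate between two latent vectors.

    alpha = 0.0 returns z_a (e.g. an instrument timbre),
    alpha = 1.0 returns z_b (e.g. a singing-voice timbre).
    """
    return (1.0 - alpha) * z_a + alpha * z_b

# Toy example with random 16-dimensional latents standing in
# for encoder outputs of two different sources.
z_instrument = np.random.randn(16)
z_voice = np.random.randn(16)

# Halfway point between the two timbres; the decoder would then
# synthesize audio from z_mix.
z_mix = interpolate_z(z_instrument, z_voice, 0.5)
```

Sweeping `alpha` from 0 to 1 and decoding each step would give the morphing trajectory I'd like to listen to.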
My guess is that the two domains might be too distinct to obtain real "cross-synthesis" if we use the model as is (also because it is a simple deterministic AE for now). Maybe enforcing domain confusion on part of the latent space could help avoid a large gap between the two distributions.
Thanks for your comment. I will spend some time doing some experiments. Closing the issue now.
We have not tried it, but the current structure of the "synthesizer" is not well suited to singing voice (formants are not well modeled by sinusoids). I guess a Neural Source Filter as the output module could be a better choice for this ;)