Open shazhongcheng opened 4 years ago
The independent representation is achieved by training the model in a "GAN" style. We first train the pitch regression network to estimate the pitch. Then we train the encoder to confuse the pitch regression network.
Please refer to this paper "A Universal Music Translation Network".
But how does that work? I know how to do confusion for a classification network: just set the label to all classes. But for a regression network trained with MSE, I don't know how to do that.
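For anyone with the same question, here is a minimal sketch of one possible reading of "confuse" for a regressor: the encoder subtracts the regressor's MSE (weighted by `lam`) from its own loss, while the regressor keeps minimizing that MSE. This thread does not confirm that the authors do it this way. The linear toy model and all names (`w_p`, `w_c`, `lam`, `recon_mse`, ...) are hypothetical stand-ins, not code from the paper.

```python
# Toy sketch: "confusing" an MSE pitch regressor by negating its loss.
# Everything here is an illustrative assumption, not the paper's code.

def encode(w, pitch, content):
    """Linear 'encoder': the latent z mixes pitch and content."""
    w_p, w_c = w
    return [w_p * p + w_c * c for p, c in zip(pitch, content)]

def pitch_mse(reg, z, pitch):
    """MSE of the adversary's pitch estimate a*z + b."""
    a, b = reg
    return sum((a * zi + b - p) ** 2 for zi, p in zip(z, pitch)) / len(z)

def recon_mse(z, content):
    """Stand-in reconstruction loss: the latent should retain content."""
    return sum((zi - c) ** 2 for zi, c in zip(z, content)) / len(z)

def encoder_loss(w, reg, pitch, content, lam):
    """Encoder objective: reconstruct well, but make the regressor fail."""
    z = encode(w, pitch, content)
    return recon_mse(z, content) - lam * pitch_mse(reg, z, pitch)

def encoder_grad(w, reg, pitch, content, lam):
    """Analytic gradient of encoder_loss with respect to (w_p, w_c)."""
    a, b = reg
    z = encode(w, pitch, content)
    n = len(z)
    g_p = g_c = 0.0
    for zi, p, c in zip(z, pitch, content):
        # d(loss)/dz: reconstruction term minus lam times the regressor term
        dz = 2 * (zi - c) - lam * 2 * (a * zi + b - p) * a
        g_p += dz * p / n
        g_c += dz * c / n
    return g_p, g_c

def regressor_step(reg, z, pitch, lr):
    """Adversary update: ordinary gradient descent on the pitch MSE."""
    a, b = reg
    n = len(z)
    err = [a * zi + b - p for zi, p in zip(z, pitch)]
    g_a = sum(2 * e * zi for e, zi in zip(err, z)) / n
    g_b = sum(2 * e for e in err) / n
    return a - lr * g_a, b - lr * g_b

def train(pitch, content, lam=0.1, lr=0.05, steps=200):
    """Alternate the two updates, GAN style."""
    w, reg = (0.5, 0.5), (0.0, 0.0)
    for _ in range(steps):
        z = encode(w, pitch, content)  # encoder held fixed
        reg = regressor_step(reg, z, pitch, lr)
        # regressor held fixed while the encoder moves against it
        g_p, g_c = encoder_grad(w, reg, pitch, content, lam)
        w = (w[0] - lr * g_p, w[1] - lr * g_c)
    return w, reg
```

Calling `train(pitch, content)` on two equal-length lists of floats returns the encoder weights and the regressor parameters. A negated MSE is unbounded, so `lam` is kept small in this sketch. In an autograd framework the same effect can be obtained by subtracting the weighted regressor loss from the encoder loss, or by reversing the gradient between encoder and regressor.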
We extract pitch using the Kaldi toolkit.
Hi, Deng,
Your work is amazing! How can we reproduce your results? Would you please publish a complete implementation of your paper?
Is there any possibility that the source code for your paper could be released? It would become a popular implementation and a SOTA reference for singing conversion!
Looking forward to hearing from you!
Sincerely, Luke Huang
I don't know how you define the mean squared error function. From your paper, I think it can't achieve a pitch-independent representation:
Hi, Cheng, have you already reproduced this paper? Is there any chance you could release your implementation?
Or, if you are building this project, I would be pleased to help!
Looking forward to hearing from you.
Sincerely
Luke Huang