Thank you for the great work; the disentanglement of content and emotion features is really novel.
While trying to reproduce this module, I got stuck on the decoder structure. Could you share some demo code?
Say we have the concatenation of content and emotion features, of shape [Batchsize, N, content_dim+emotion_dim]. How is it converted to MFCC features of shape [Batchsize, N, 28, 12]?
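To make the shapes concrete, here is a minimal PyTorch sketch of the kind of decoder I am guessing at. It is only my assumption, not the structure from the paper: the class name MfccDecoder, the two-layer MLP, hidden_dim=512, and the feature sizes 256 and 128 are all placeholders I made up. Each time step is projected to 28*12 = 336 values and then reshaped.

```python
import torch
import torch.nn as nn


class MfccDecoder(nn.Module):
    """Hypothetical decoder (my guess, not the authors' design)."""

    def __init__(self, content_dim, emotion_dim, hidden_dim=512, mfcc_shape=(28, 12)):
        super().__init__()
        self.mfcc_shape = mfcc_shape
        out_dim = mfcc_shape[0] * mfcc_shape[1]  # 28 * 12 = 336
        # nn.Linear acts on the last dimension, so [B, N, D] inputs work directly.
        self.net = nn.Sequential(
            nn.Linear(content_dim + emotion_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, out_dim),
        )

    def forward(self, fused):
        # fused: [B, N, content_dim + emotion_dim]
        batch, steps, _ = fused.shape
        flat = self.net(fused)  # [B, N, 336]
        # Split the flat 336-vector of each step into a 28 x 12 MFCC patch.
        return flat.reshape(batch, steps, *self.mfcc_shape)


content_dim, emotion_dim = 256, 128  # placeholder sizes, not from the paper
decoder = MfccDecoder(content_dim, emotion_dim)
fused = torch.randn(4, 10, content_dim + emotion_dim)  # B=4, N=10
mfcc = decoder(fused)
print(mfcc.shape)  # torch.Size([4, 10, 28, 12])
```

This produces the right output shape, but I do not know whether the actual decoder is a plain MLP like this or something convolutional or recurrent, which is why I am asking.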
Looking forward to your reply and thank you in advance!