jixinya / EVP

Code for paper 'Audio-Driven Emotional Video Portraits'.

About the DECODER used in the Cross-reconstruction Emotion Disentanglement Module #7

Open DaddyJin opened 3 years ago

DaddyJin commented 3 years ago

Thank you for the great work; the disentanglement of content and emotion features is really novel. While reproducing this module, I got stuck on the decoder structure. Could you share some demo code? Say we have the concatenation of content and emotion features, with shape [Batchsize, N, content_dim+emotion_dim]: how is it converted to MFCC features of shape [Batchsize, N, 28, 12]? Looking forward to your reply, and thank you in advance!

jixinya commented 2 years ago

I have released the training code. You can check it in train/disentanglement/code/models_content_cla.py (class Decoder).
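
For readers who only want to see how the shapes in the question can line up, here is a minimal NumPy sketch. It is not the Decoder class from the repository, whose actual layers should be read from the file named above. It assumes the simplest possible mapping: one per-frame linear projection from content_dim+emotion_dim to 28*12 = 336 values, followed by a reshape. The function name decode_to_mfcc, the feature sizes 256 and 128, and the random weights are all hypothetical and chosen only for illustration.

```python
import numpy as np


def decode_to_mfcc(features, weight, bias, mfcc_shape=(28, 12)):
    """Map [B, N, D] features to [B, N, 28, 12] with one linear layer per frame.

    Hypothetical stand-in for a decoder; a real decoder would be learned
    and is likely deeper than a single projection.
    """
    batch, frames, dim = features.shape
    out_dim = mfcc_shape[0] * mfcc_shape[1]  # 28 * 12 = 336
    assert weight.shape == (dim, out_dim), "weight must be [D, 336]"
    flat = features @ weight + bias          # [B, N, 336]
    return flat.reshape(batch, frames, *mfcc_shape)


rng = np.random.default_rng(0)
content_dim, emotion_dim = 256, 128          # assumed sizes, not from the paper
B, N = 4, 10

content = rng.standard_normal((B, N, content_dim))
emotion = rng.standard_normal((B, N, emotion_dim))
features = np.concatenate([content, emotion], axis=-1)  # [B, N, 384]

# Untrained placeholder parameters, only to demonstrate the shapes.
weight = rng.standard_normal((content_dim + emotion_dim, 28 * 12)) * 0.01
bias = np.zeros(28 * 12)

mfcc = decode_to_mfcc(features, weight, bias)
print(mfcc.shape)  # (4, 10, 28, 12)
```

In a PyTorch model the same idea would typically be a learned linear (or convolutional) stack followed by a view to [Batchsize, N, 28, 12]; whether the released Decoder does this or something else is not stated in this thread.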