jixinya / EVP

Code for paper 'Audio-Driven Emotional Video Portraits'.

The meaning of the output #13

Open RegisWu opened 2 years ago

RegisWu commented 2 years ago

Hi and thanks for sharing the great work!

I followed your instructions and successfully ran the test scripts to generate output. However, I am confused about the meaning of the lm2video output names, such as 'M003_01_3_output_01/', 'M003_02_3_output_01'...

Could you please briefly explain what the driving audio for synthesizing these results is, and where we can find the corresponding lip-sync video?

Looking forward to your response. Thanks.

jixinya commented 2 years ago

The output naming rule for M003 differs from that of the other speakers (M009, M030, etc.) because the MEAD dataset was updated. The old rule follows the order of sentences, while the new rule uses the emotion category. Specifically, 'M003_72_1_output_01' is the result for sentence '01' on page '72' at intensity '1'; page 72 falls in the neutral category (page numbers: angry: 1-8, disgust: 9-16, contempt: 17-24, fear: 25-33, happy: 34-43, sad: 44-52, surprised: 53-61, neutral: 62-73). By contrast, 'M030_angry_001' is the result for sentence '001' of the 'angry' emotion. We have uploaded the M003 audio with the old naming rule here (https://drive.google.com/file/d/1aLBfLJ3TMIKnLY1ZQtxUXeXXNMuyuqw9/view?usp=sharing) for inference. MEAD has three emotion intensities, but EVP uses only the strongest one.
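For anyone who wants to decode these names programmatically, here is a minimal sketch of the two naming rules described above. It is not part of the EVP code base: the function names, the `OutputName` type, and the regular expressions are my own assumptions, inferred only from the three example names in this thread, so the patterns may need adjusting for other files.

```python
import re
from typing import NamedTuple, Optional

# Page ranges per emotion for the old (M003) naming rule, as listed in this thread.
PAGE_RANGES = {
    "angry": (1, 8),
    "disgust": (9, 16),
    "contempt": (17, 24),
    "fear": (25, 33),
    "happy": (34, 43),
    "sad": (44, 52),
    "surprised": (53, 61),
    "neutral": (62, 73),
}

# Hypothetical patterns inferred from 'M003_72_1_output_01' and 'M030_angry_001'.
OLD_RULE = re.compile(r"^([A-Z]\d+)_(\d+)_(\d+)_output_(\d+)$")
NEW_RULE = re.compile(r"^([A-Z]\d+)_([a-z]+)_(\d+)$")


class OutputName(NamedTuple):
    speaker: str
    emotion: str
    sentence: str
    page: Optional[int]       # only present under the old rule
    intensity: Optional[int]  # only present under the old rule


def emotion_for_page(page: int) -> str:
    """Map an old-rule page number to its emotion category."""
    for emotion, (first, last) in PAGE_RANGES.items():
        if first <= page <= last:
            return emotion
    raise ValueError(f"page {page} is outside the known ranges 1-73")


def parse_output_name(name: str) -> OutputName:
    """Decode an output folder or file stem under either naming rule."""
    name = name.rstrip("/")  # tolerate a trailing slash on folder names
    match = OLD_RULE.match(name)
    if match:
        speaker, page, intensity, sentence = match.groups()
        return OutputName(speaker, emotion_for_page(int(page)), sentence,
                          int(page), int(intensity))
    match = NEW_RULE.match(name)
    if match:
        speaker, emotion, sentence = match.groups()
        return OutputName(speaker, emotion, sentence, None, None)
    raise ValueError(f"unrecognized output name: {name!r}")
```

Calling `parse_output_name("M003_72_1_output_01")` gives the neutral category on page 72 with intensity 1, while `parse_output_name("M030_angry_001")` gives the 'angry' emotion with no page or intensity.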

H4mster commented 1 year ago

The link has expired. Can you resend a link?