jixinya / EVP

Code for paper 'Audio-Driven Emotional Video Portraits'.

How to generate a demo with self-prepared data? #3

Open Breeze-Zero opened 3 years ago

Breeze-Zero commented 3 years ago

Generalization cannot be demonstrated if only the provided test cases can be reproduced.

jixinya commented 3 years ago

Hi, you can use an arbitrary audio clip as audio input and assign the emotion by changing the emo_feature in audio2lm/test.py. For generalization to the target person, you need to replace the example_landmark in test.py with the aligned landmarks of the target person and then fine-tune the vid2vid network with data of the target person. We are still working on the generalization of the rendering network.
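
As a rough illustration of those two substitutions (variable names taken from this thread; the actual audio2lm/test.py may organize things differently, and the file paths here are assumptions):

```python
# Hypothetical sketch of the relevant edits in audio2lm/test.py.
# The names emo_feature and example_landmark come from this thread;
# check the actual script for the exact variables and file layout.
import numpy as np

# 1. Point the script at your own audio clip.
audio_path = 'my_audio.wav'                        # arbitrary input audio

# 2. Assign the target emotion by swapping in the emotion feature for the
#    desired category (assumed file layout).
emo_feature = np.load('emo_features/happy.npy')

# 3. Replace the reference landmarks with the aligned landmarks of your
#    target person (same normalization as the provided example_landmark).
example_landmark = np.load('my_person_landmark.npy')

# The rest of test.py then runs audio -> landmark inference as before;
# vid2vid still has to be fine-tuned on footage of the target person.
```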

Breeze-Zero commented 3 years ago

> Hi, you can use an arbitrary audio clip as audio input and assign the emotion by changing the emo_feature in audio2lm/test.py. For generalization to the target person, you need to replace the example_landmark in test.py with the aligned landmarks of the target person and then fine-tune the vid2vid network with data of the target person. We are still working on the generalization of the rendering network.

Thank you for your reply. I tried to use `fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._3D, flip_input=False)` and `fa.get_landmarks(input)` to obtain landmarks, but the resulting array type seems to differ from the provided example_landmark npy file. What additional operations are needed?
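
For reference, a runnable version of that snippet using the open-source face_alignment library (on newer versions the enum is `LandmarksType.THREE_D`). It yields 68 × 3 arrays, which indeed do not match the 106-point 2D layout of the provided example_landmark:

```python
# Sketch of the face_alignment call from the comment above, cleaned up.
# Note: this library detects 68 keypoints, not the repo's 106-point layout.
import face_alignment
from skimage import io

fa = face_alignment.FaceAlignment(face_alignment.LandmarksType._3D,
                                  flip_input=False)
image = io.imread('frame.png')          # any face image / video frame
preds = fa.get_landmarks(image)         # list with one array per detected face
print(preds[0].shape)                   # (68, 3): 68 points with (x, y, z)
```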

jixinya commented 3 years ago

Note that the facial landmarks used in our paper (106 points) are different from face_alignment, which uses the typical 68-facial-keypoint setting. The detection algorithm we used is not open-source due to the related policy of the company. As an alternative, you could re-train the model by replacing our landmark setting with the commonly used 68 points, and we believe it would generate similar results.
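
For anyone attempting that retraining, a hypothetical sketch of the dimensional change: with 68 instead of 106 points, the flattened (x, y) landmark vector shrinks from 212 to 136, so any layer sized to the landmark dimension must be adjusted (the real Lm_encoder in this repo may be structured differently):

```python
# Hypothetical illustration of adapting the landmark encoder input size
# when switching from the paper's 106-point setting to 68 points.
import torch.nn as nn

N_POINTS = 68              # was 106 in the paper's setting
lm_dim = N_POINTS * 2      # 136 instead of 212 for flattened (x, y) coords

# Assumed encoder shape for illustration only.
lm_encoder = nn.Sequential(
    nn.Linear(lm_dim, 256),
    nn.ReLU(),
    nn.Linear(256, 512),
)
```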

MitchellX commented 3 years ago

Hi @jixinya, could you tell me which 106-point model you are using: the JD 106-landmark model or the Face++ 106-landmark model?

Baenny commented 2 years ago

> Note that the facial landmarks used in our paper (106 points) are different from face_alignment, which uses the typical 68-facial-keypoint setting. The detection algorithm we used is not open-source due to the related policy of the company. As an alternative, you could re-train the model by replacing our landmark setting with the commonly used 68 points, and we believe it would generate similar results.

Hi @jixinya, when will the training code be released?

Neroaway commented 2 years ago

How do you crop the face images from the video? (Do you use a fixed crop region or a face detection algorithm?) Thanks~

chentting commented 2 years ago

What are the pca and mean files? Can I use example_landmark, without applying the PCA and mean normalization, as the input of Lm_encoder?