Closed · nitinmukesh closed this issue 2 months ago
How to use Video-driven setting
python demo_EDTalk_V.py --source_path ./test_data/identity_source.jpg --lip_driving_path ./test_data/mouth_source.mp4 --audio_driving_path ./test_data/mouth_source.wav --pose_driving_path ./test_data/pose_source2.mp4 --exp_driving_path ./test_data/exp_weights/angry.npy --save_path ./output/2.mp4 --face_sr
(EDTalk) C:\usable\EDTalk>python demo_EDTalk_V.py --source_path ./test_data/identity_source.jpg --lip_driving_path ./test_data/mouth_source.mp4 --audio_driving_path ./test_data/mouth_source.wav --pose_driving_path ./test_data/pose_source2.mp4 --exp_driving_path ./test_data/exp_weights/angry.npy --save_path ./output/3.mp4 --face_sr
==> loading model
==> loading data
Traceback (most recent call last):
  File "demo_EDTalk_V.py", line 163, in <module>
    demo = Demo(args)
  File "demo_EDTalk_V.py", line 77, in __init__
    self.exp_vid_target, self.fps = vid_preprocessing(args.exp_driving_path)
  File "demo_EDTalk_V.py", line 36, in vid_preprocessing
    fps = vid_dict[2]['video_fps']
KeyError: 'video_fps'
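For context, the traceback suggests `vid_preprocessing` hands the driving path to a video reader that returns a `(video, audio, metadata)` tuple and then looks up `'video_fps'` in the metadata; when the path points at an `.npy` weights file rather than a video, no video stream is found and the key is absent. A minimal sketch of that failure mode, with the reader's return value simulated (the helper name is my own, not EDTalk's code):

```python
def get_fps(vid_dict):
    """vid_dict mimics a torchvision.io.read_video-style return value:
    (video_tensor, audio_tensor, metadata). For a non-video input the
    metadata dict carries no 'video_fps' entry, hence the KeyError."""
    meta = vid_dict[2]
    if "video_fps" not in meta:
        raise ValueError(
            "no video stream in driving file -- expected a video, "
            "got something else (e.g. a .npy weights file)?"
        )
    return meta["video_fps"]

# A real video yields metadata like {'video_fps': 25.0, 'audio_fps': 16000}
print(get_fps((None, None, {"video_fps": 25.0, "audio_fps": 16000})))  # 25.0
```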
Teeth are all messed up. Any suggestions?
Lip motion
python demo_lip_pose.py --fix_pose --source_path ./test_data/identity_source.jpg --audio_driving_path test_data/teaser.mp3 --save_path ./output/1.mp4 --face_sr
https://github.com/user-attachments/assets/22ba02ea-ae01-4fc6-ba6f-1a37ef9038e0
Head pose
https://github.com/user-attachments/assets/fdb78961-239a-49bf-9428-174c7c8704a3
Video-driven setting
I think the problem in this case is that the face_sr module mistook the mouth for teeth, so after face_sr the mouth is full of teeth.
Hi, you can run
python demo_EDTalk_V_using_predefined_exp_weights.py --source_path ./test_data/identity_source.jpg --lip_driving_path ./test_data/mouth_source.mp4 --audio_driving_path ./test_data/mouth_source.wav --pose_driving_path ./test_data/pose_source2.mp4 --exp_type "angry" --save_path ./output/2.mp4 --face_sr
or
python demo_EDTalk_V.py --source_path ./test_data/identity_source.jpg --lip_driving_path ./test_data/mouth_source.mp4 --audio_driving_path ./test_data/mouth_source.wav --pose_driving_path ./test_data/pose_source2.mp4 --exp_driving_path ./test_data/expression_source.mp4 --save_path ./output/2.mp4 --face_sr
When running demo_EDTalk_V.py, --exp_driving_path should be a video path, not a .npy weights file.
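Since the error only surfaces deep inside `vid_preprocessing`, a cheap guard at argument-handling time would fail faster with a clearer message. This is just a sketch (the helper name and extension list are my own, not part of EDTalk):

```python
from pathlib import Path

# Common container formats a driving video might use (assumed list).
VIDEO_EXTS = {".mp4", ".avi", ".mov", ".mkv"}

def check_driving_video(path, arg_name="--exp_driving_path"):
    """Fail early with a readable message instead of a KeyError later."""
    if Path(path).suffix.lower() not in VIDEO_EXTS:
        raise SystemExit(
            f"{arg_name} expects a video file, got '{path}'. "
            "For .npy expression weights, use "
            "demo_EDTalk_V_using_predefined_exp_weights.py instead."
        )

check_driving_video("./test_data/expression_source.mp4")  # passes silently
```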
Hi, I used the newly added video-driven script:
python demo_lip_pose_V.py --source_path test_data/identity_source.jpg --lip_driving_path test_data/mouth_source.mp4 --pose_driving_path test_data/pose_source1.mp4 --face_sr
and the result is:
https://github.com/user-attachments/assets/912097cf-ce92-42ca-960b-c4e0906cb0b0
After face_sr:
https://github.com/user-attachments/assets/c4e1a81c-76c1-462a-b671-9c82e37e14ad
And another source person:
https://github.com/user-attachments/assets/4e630594-1dd2-47fb-b367-6be7a700c769
https://github.com/user-attachments/assets/f1a0b477-a120-47a5-b925-00af4ff09781
Welcome to try~
Thank you @tanshuai0219. I will try this now.
Could you please help with this: if I have an image and audio and I want expressions like Sad, Happy, etc., how can I do that? What is the purpose of the NPY files in test_data\exp_weights?
For example, a man is sad and speaking. I would want to use the image of that man, the dialogue audio, and the expression, which I understand comes from the NPY files. What should the command syntax be?
try:
python demo_EDTalk_A_using_predefined_exp_weights.py --source_path ./test_data/identity_source.jpg --audio_driving_path ./test_data/mouth_source.wav --pose_driving_path ./test_data/pose_source2.mp4 --exp_type "sad" --save_path ./output/sad.mp4 --face_sr
The results should be: https://github.com/user-attachments/assets/3b069f23-aa8b-4438-8401-345854b2e8c0
https://github.com/user-attachments/assets/d3eb156b-a523-4cf2-9d22-78d7de061bd3
The exp_type can be selected from ['angry', 'contempt', 'disgusted', 'fear', 'happy', 'sad', 'surprised']
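Judging from the filenames in `test_data/exp_weights` (e.g. `angry.npy`), each predefined `--exp_type` presumably just selects the matching weights file. A hedged sketch of how the option could be validated and resolved (the argparse usage here is my own illustration, not the script's actual code):

```python
import argparse

EXP_TYPES = ["angry", "contempt", "disgusted", "fear", "happy", "sad", "surprised"]

parser = argparse.ArgumentParser()
# choices= rejects anything outside the supported emotion list up front.
parser.add_argument("--exp_type", choices=EXP_TYPES, required=True)
args = parser.parse_args(["--exp_type", "sad"])

# Each type presumably maps to a weights file named after the emotion.
weight_path = f"./test_data/exp_weights/{args.exp_type}.npy"
print(weight_path)  # ./test_data/exp_weights/sad.npy
```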
Thank you @tanshuai0219. Is there any way to fix the teeth? I have attached outputs both with and without face_sr. The expressions are very good. I am planning to release a video tutorial for this tool. I am very interested, and a lot of others are too, as it provides uniqueness in terms of expressions. I did the video tutorial for AniTalker too.
Angry
https://github.com/user-attachments/assets/d2626c21-38d8-4918-89af-a10fd9219bd7
https://github.com/user-attachments/assets/f2a1348c-435d-4a1f-9a5e-b5ef0d03cb51
Contempt
https://github.com/user-attachments/assets/afbf474c-1973-4375-85e8-9b332a2694f4
https://github.com/user-attachments/assets/d20b3167-557d-4b2a-910a-3279f0df70fc
Disgusted
https://github.com/user-attachments/assets/c8a2df54-1378-4381-98c8-2bb502edbbfc
https://github.com/user-attachments/assets/688935a2-194c-4fbe-8bea-ed9051b1d0f7
Fear
https://github.com/user-attachments/assets/2b85f055-72f3-4af8-b801-adad175c3473
https://github.com/user-attachments/assets/f6f85957-8f87-4b60-9016-e5989dd95a47
Happy
https://github.com/user-attachments/assets/21b9274d-5963-4c8f-90bb-76ff12d1db08
https://github.com/user-attachments/assets/82f7c2c6-874c-417d-9e60-ec569c57386d
Sad
https://github.com/user-attachments/assets/3c8c521a-d69c-4f5a-8e76-3fb5173a2764
https://github.com/user-attachments/assets/e6037e6e-de2a-4870-9798-0e3cece6d9ac
Surprised
Since we only used a very small amount of data to train the model, the clarity of the teeth is slightly worse. face_sr is currently the simplest workaround we came up with for this problem. When I have time, I will use more data and improve the model. Thanks for your interest in EDTalk and for publicizing it.
Hi, I put your generated cases in the README; thanks for providing interesting cases. If you generate other interesting cases, please contact me. I plan to rebuild the project page (https://tanshuai0219.github.io/EDTalk/) and need more examples. If your cases are presented in the project, I will credit your contribution there~
@tanshuai0219
Sure, I will create more data and examples for each type of inference.
I'm just busy creating the video tutorial; once I post it, I will work on more thorough testing.