OpenTalker / SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
https://sadtalker.github.io/

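How exactly is ref_pose supposed to be used? If it works, would the author please confirm that it works; if it does not, please say so. Don't leave everyone making pointless attempts. #690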

Open ykhasia opened 10 months ago

ykhasia commented 10 months ago

Below is my command line. I added --ref_pose to control the character's head movement, but in the generated result the head does not follow the motion of the person in the video given to --ref_pose at all. The output is no different from a run without --ref_pose.

Questions for the author:
1. Does the video given to --ref_pose need to match the image given to --source_image in any way, for example in aspect ratio or background color?
2. Could the warning "OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4" cause --ref_pose to have no effect?

Command and output:

E:\ProgramFiles\ai\SadTalker>python inference.py --driven_audio E:/products/digital_person/20231030/1.mp3 --source_image E:/products/digital_person/person1/5_bg_512_768.png --preprocess extfull --enhancer gfpgan --result_dir E:/products/digital_person/20231030 --ref_pose E:/products/digital_person/ref9.mp4
using safetensor as default
3DMM Extraction for source image
landmark Det:: 100%|█████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 5.45it/s]
3DMM Extraction In Video:: 100%|█████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 4.00it/s]
3DMM Extraction for the reference video providing pose
landmark Det:: 100%|█████████████████████████████████████████████████████████████████| 600/600 [00:26<00:00, 22.48it/s]
3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████| 600/600 [00:05<00:00, 112.88it/s]
mel:: 100%|██████████████████████████████████████████████████████████████████████| 540/540 [00:00<00:00, 107997.53it/s]
audio2exp:: 100%|██████████████████████████████████████████████████████████████████████| 54/54 [00:00<00:00, 61.73it/s]
Face Renderer:: 100%|████████████████████████████████████████████████████████████████| 270/270 [00:41<00:00, 6.47it/s]
IMAGEIO FFMPEG_WRITER WARNING: input image is not divisible by macro_block_size=16, resizing from (256, 253) to (256, 256) to ensure video compatibility with most codecs and players. To prevent resizing, make your input image divisible by the macro_block_size or set the macro_block_size to 1 (risking incompatibility).
Active code page: 65001
The generated video is named E:/products/digital_person/20231030\2023_10_31_19.39.35/5_bg_512_768##1.mp4
OpenCV: FFMPEG: tag 0x5634504d/'MP4V' is not supported with codec id 12 and format 'mp4 / MP4 (MPEG-4 Part 14)'
OpenCV: FFMPEG: fallback to use tag 0x7634706d/'mp4v'
seamlessClone:: 100%|████████████████████████████████████████████████████████████████| 540/540 [00:40<00:00, 13.47it/s]
Active code page: 65001
The generated video is named E:/products/digital_person/20231030\2023_10_31_19.39.35/5_bg_512_768##1_full.mp4
face enhancer....
Face Enhancer:: 100%|████████████████████████████████████████████████████████████████| 540/540 [02:11<00:00, 4.12it/s]
Active code page: 65001
The generated video is named E:/products/digital_person/20231030\2023_10_31_19.39.35/5_bg_512_768##1_enhanced.mp4
The generated video is named: E:/products/digital_person/20231030\2023_10_31_19.39.35.mp4

wawaa commented 9 months ago

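For your second point, changing the mp4 tag to lowercase fixes it. That is only a video encoding issue and should have nothing to do with head motion.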
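
To illustrate why the case of the tag matters, here is a minimal sketch in plain Python showing that the two hex values in the OpenCV warning are simply the four characters 'MP4V' and 'mp4v' packed into a 32-bit integer. The helper name `fourcc` is made up for this example. In OpenCV the corresponding call is `cv2.VideoWriter_fourcc(*"mp4v")`. Where SadTalker sets this tag is not shown in this thread, so treat that part as something to look up in the code.

```python
def fourcc(tag: str) -> int:
    """Pack a four-character code into a little-endian 32-bit integer.

    Hypothetical helper for illustration; it mirrors how FOURCC tags
    are conventionally encoded, with the first character in the low byte.
    """
    if len(tag) != 4:
        raise ValueError("a FOURCC tag must be exactly four characters")
    return int.from_bytes(tag.encode("ascii"), byteorder="little")


upper = fourcc("MP4V")  # the tag the warning says is not supported
lower = fourcc("mp4v")  # the tag OpenCV falls back to

print(hex(upper))  # 0x5634504d
print(hex(lower))  # 0x7634706d
```

Both values match the log above, and the log shows OpenCV falling back to the lowercase tag by itself, so the video is still written. That is consistent with the warning being cosmetic and not something that touches the pose coefficients.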