OpenTalker / SadTalker

[CVPR 2023] SadTalker: Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
https://sadtalker.github.io/

using reference video returns cv2.error #448

Open Chocobi-1129 opened 1 year ago

Chocobi-1129 commented 1 year ago

An error appears when I try to use a reference video (.mp4 file); the same error also occurs in the SD WebUI. Here's the log:

python inference.py --driven_audio ./input/audio/a5_ch.mp4 --source_image ./input/video/v1.mp4 --still --preprocess full --enhance gfpgan --ref_pose ./input/video/v2.mp4
using safetensor as default
3DMM Extraction for source image
landmark Det:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 36.46it/s]
3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 146.69it/s]
3DMM Extraction for the reference video providing pose
landmark Det::  44%|███████████████████████████████████▉                                             | 134/302 [00:03<00:04, 40.86it/s]
Traceback (most recent call last):
  File "inference.py", line 144, in <module>
    main(args)
  File "inference.py", line 69, in main
    ref_pose_coeff_path, _, _ =  preprocess_model.generate(ref_pose, ref_pose_frame_dir, args.preprocess, source_image_flag=False)
  File "/home/hsiang1129/SadTalker/src/utils/preprocess.py", line 124, in generate
    lm = self.propress.predictor.extract_keypoint(frames_pil, landmarks_path)
  File "/home/hsiang1129/SadTalker/src/face3d/extract_kp_videos_safe.py", line 39, in extract_keypoint
    current_kp = self.extract_keypoint(image)
  File "/home/hsiang1129/SadTalker/src/face3d/extract_kp_videos_safe.py", line 60, in extract_keypoint
    keypoints = landmark_98_to_68(self.detector.get_landmarks(img)) # [0]
  File "/home/hsiang1129/miniforge3/envs/sadtalker/lib/python3.8/site-packages/facexlib/alignment/awing_arch.py", line 363, in get_landmarks
    img = cv2.resize(img, (256, 256))
cv2.error: OpenCV(4.7.0) /io/opencv/modules/imgproc/src/resize.cpp:4062: error: (-215:Assertion failed) !ssize.empty() in function 'resize'

I did some research, and it seems the issue may be related to the file format, but I'm not sure. My OpenCV version is 4.7.0.72.

vinthony commented 1 year ago

It seems some of your frames do NOT contain a human face.
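To see which frames of a reference video the detector misses, one option is to scan the frames before running inference. The following is a minimal sketch under assumptions, not SadTalker's own code: `find_undetected_frames` and `fake_detector` are hypothetical names, and `detect_faces` stands in for whatever face detector is used, assumed here to return `None` or an empty sequence when it finds no face.

```python
from typing import Callable, Iterable, List, Optional, Sequence


def find_undetected_frames(
    frames: Iterable[object],
    detect_faces: Callable[[object], Optional[Sequence]],
) -> List[int]:
    """Return the indices of frames in which the detector finds no face.

    `detect_faces` is a hypothetical callable: it takes one frame and returns
    a sequence of bounding boxes, or None / an empty sequence if nothing is found.
    """
    missing = []
    for index, frame in enumerate(frames):
        boxes = detect_faces(frame)
        if boxes is None or len(boxes) == 0:
            missing.append(index)
    return missing


def fake_detector(frame):
    """Stand-in detector for illustration: 'finds' a face only in non-empty frames."""
    return [(0, 0, 10, 10)] if frame else []


# Frames are represented by placeholder strings here; real frames would be images.
print(find_undetected_frames(["face", "", "face", ""], fake_detector))  # [1, 3]
```

The reported indices can then be used to trim or replace the problematic part of the reference video.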

Chocobi-1129 commented 1 year ago

Thanks for your reply~

That sounds like a reasonable explanation for the failure, though I'm pretty sure the entire video contains a face (perhaps the face detection algorithm just can't detect it). I'll also think about whether there's a solution for this detection issue.

However, when I try another reference video, a different error occurs, as shown below:

(sadtalker) hsiang1129@test:~/SadTalker$ python inference.py --driven_audio ./input/audio/a1.mp4 --source_image ./input/video/v1.mp4 --still --preprocess full --enhance gfpgan --ref_pose ./input/video/input_v17.mp4
using safetensor as default
3DMM Extraction for source image
landmark Det:: 100%|█████████████████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 36.34it/s]
3DMM Extraction In Video:: 100%|████████████████████████████████████████████████████████████████████████| 1/1 [00:00<00:00, 147.29it/s]
3DMM Extraction for the reference video providing pose
landmark Det::  54%|███████████████████████████████████████████▍                                     | 267/498 [00:06<00:05, 41.29it/s]
Traceback (most recent call last):
  File "inference.py", line 144, in <module>
    main(args)
  File "inference.py", line 69, in main
    ref_pose_coeff_path, _, _ =  preprocess_model.generate(ref_pose, ref_pose_frame_dir, args.preprocess, source_image_flag=False)
  File "/home/hsiang1129/SadTalker/src/utils/preprocess.py", line 124, in generate
    lm = self.propress.predictor.extract_keypoint(frames_pil, landmarks_path)
  File "/home/hsiang1129/SadTalker/src/face3d/extract_kp_videos_safe.py", line 39, in extract_keypoint
    current_kp = self.extract_keypoint(image)
  File "/home/hsiang1129/SadTalker/src/face3d/extract_kp_videos_safe.py", line 57, in extract_keypoint
    bboxes = bboxes[0]
IndexError: index 0 is out of bounds for axis 0 with size 0

This seems to be the same as #127. Is it also related to the face detection issue?

Thanks again and really appreciate your help.

vinthony commented 1 year ago

Yes, it seems that the face in frame 134 (first video) and frame 267 (second video) cannot be detected by the algorithm.
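One possible workaround, sketched here under assumptions, is to reuse the keypoints of the nearest detected frame whenever detection fails on a frame. This is not how SadTalker handles the case; `fill_missing_keypoints` is a hypothetical helper, and it assumes failed frames have already been recorded as `None` in a per-frame list.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def fill_missing_keypoints(per_frame: Sequence[Optional[T]]) -> List[T]:
    """Replace None entries with the keypoints of the nearest detected frame.

    Gaps are filled from the previous detected frame; gaps at the very start
    are filled from the first detected frame that follows them.
    """
    filled: List[Optional[T]] = list(per_frame)

    # Forward pass: carry the last detected keypoints into later gaps.
    last = None
    for i, value in enumerate(filled):
        if value is None:
            filled[i] = last
        else:
            last = value

    # Backward pass: fill any leading gaps from the first detected frame.
    following = None
    for i in range(len(filled) - 1, -1, -1):
        if filled[i] is None:
            filled[i] = following
        else:
            following = filled[i]

    if any(value is None for value in filled):
        raise ValueError("no frame had a detected face")
    return filled


# Placeholder strings stand in for real keypoint arrays.
print(fill_missing_keypoints([None, "kp_a", None, None, "kp_b", None]))
```

Repeating keypoints freezes the pose for the affected frames, so this only makes sense when the gaps are short; for long gaps, trimming the reference video is probably the safer choice.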

Chocobi-1129 commented 1 year ago

got it! Thanks for your great work!