karanvivekbhargava / obamanet

ObamaNet : Photo-realistic lip-sync from audio (Unofficial port)
MIT License
235 stars, 71 forks

How to keep the mouth area consistent with the face? #6

Closed: xiaoyun4 closed this issue 5 years ago

xiaoyun4 commented 6 years ago

The work is amazing, but one thing confused me when I tried to test it. As described in the paper, the keypoints are normalized to be invariant to face location and to in-plane and out-of-plane face rotation.

But when I ran the test, I found that in the test dataset the keypoints rotate with the face and are continuous across frames. Yet, as described above, the keypoints generated from audio are invariant to face location and to in-plane and out-of-plane face rotation.

Therefore, I want to know: how do the keypoints stay consistent with faces of different poses and sizes?

Thanks a lot!

karanvivekbhargava commented 6 years ago

As addressed in the earlier issue, this consistency is kept by normalizing the points: you essentially divide the points by the norm of all the keypoints.
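A minimal sketch of that normalization, under stated assumptions: the function names `normalize_keypoints` and `denormalize_keypoints` are hypothetical and not taken from this repository, the keypoints are assumed to be an (N, 2) array of pixel coordinates, and centering on the mean before dividing by the norm is an assumption about how face location is removed. Rotation is not handled here. The second function shows one plausible way to map normalized points back onto a particular face by reusing that face's own center and scale.

```python
import numpy as np


def normalize_keypoints(points):
    """Remove location and size from an (N, 2) array of keypoints.

    Hypothetical helper: centers the points on their mean, then divides
    by the norm of all centered points, as described in the comment above.
    """
    points = np.asarray(points, dtype=np.float64)
    center = points.mean(axis=0)          # face location
    centered = points - center
    scale = np.linalg.norm(centered)      # norm over all keypoints
    if scale == 0:
        raise ValueError("all keypoints coincide; cannot normalize")
    return centered / scale, center, scale


def denormalize_keypoints(normalized, center, scale):
    """Map normalized keypoints back to a face with the given center and scale."""
    return np.asarray(normalized, dtype=np.float64) * scale + center


# The same mouth shape at two different positions and sizes.
mouth = np.array([[100.0, 200.0], [140.0, 190.0], [180.0, 200.0], [140.0, 220.0]])
shifted_scaled = mouth * 2.5 + np.array([30.0, -12.0])

norm_a, center_a, scale_a = normalize_keypoints(mouth)
norm_b, center_b, scale_b = normalize_keypoints(shifted_scaled)

# Both normalize to the same shape; placing it on the second face
# only needs that face's own center and scale.
restored = denormalize_keypoints(norm_a, center_b, scale_b)
```

In this sketch the normalized shape is independent of where the face is and how large it is, so a predicted shape could be placed on any target frame as long as that frame's center and scale are known, for example from keypoints detected on the target video.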

xiaoyun4 commented 6 years ago

Thanks for your reply. From what you said, the keypoints should be normalized? But I am still confused. As described in section 4.2 of your paper, the keypoints are processed to remove variations such as face location, face size, and in-plane and out-of-plane face rotation. That is to say, the keypoints are frontalized in both pose and size.

However, the faces in the test dataset you provided are not frontalized: the pose varies and, in particular, the face size differs from video to video. But the keypoints predicted from speech are normalized. How do the keypoints stay consistent with faces of different poses and sizes?

Thanks a lot

xiaoyun4 commented 6 years ago

Could you please release the code that produces the cropped images and prepares the training dataset?

karanvivekbhargava commented 5 years ago
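import os
from glob import glob

import cv2
import numpy as np

# Source frames to crop; use 'images/*.bmp' for the training set.
filelist = sorted(glob('testing/*.bmp'))
# Output folder; use 'cropped_images' for the training set.
out_dir = 'testing_cropped_images'
os.makedirs(out_dir, exist_ok=True)  # cv2.imwrite does not create missing folders

for i, path in enumerate(filelist):
    img = cv2.imread(path)
    if img is None:  # cv2.imread returns None for unreadable files; skip them
        continue
    # Horizontal offset that centers a 256-pixel-wide window.
    x = int(np.floor((img.shape[1] - 256) / 2))
    # Keep the top 256 rows and the centered 256 columns.
    crop_img = img[0:256, x:x + 256]
    # cv2.imshow("cropped", crop_img)
    # cv2.waitKey(1000)
    filename = os.path.join(out_dir, str(i) + '.bmp')
    print(filename)
    cv2.imwrite(filename, crop_img)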

You might have to make some changes to this, @xiaoyun4, but it should do the trick or at least give you an idea of how to do this.