jtchen0528 / PCL-I2G

Unofficial Implementation: Learning Self-Consistency for Deepfake Detection
What's the function of celebahq_crop when preprocessing frames? #4

Open Mark-Dou opened 2 years ago

Mark-Dou commented 2 years ago

Hello, could you explain what celebahq_crop does when preprocessing frames? I'm not very clear on that. Thank you!

jtchen0528 commented 2 years ago

Its function is to detect the face or faces in an image and crop them out. Since faces are not aligned in raw videos (or in the FF++ videos), we have to preprocess the frames by aligning the faces and then cropping them out.

The code is modified from chail/patch-forensics.
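
To illustrate the alignment step described above, here is a minimal sketch using only NumPy. It is not the code from this repository or from chail/patch-forensics. The function names, the use of two eye landmarks, and the parameters `out_size`, `eye_y`, and `eye_dist` are all hypothetical. It assumes a landmark detector has already returned the two eye centres, and it builds a 2x3 similarity transform that rotates, scales, and shifts the frame so the eyes land at fixed positions in the crop.

```python
import numpy as np


def eye_alignment_matrix(left_eye, right_eye, out_size=256, eye_y=0.4, eye_dist=0.3):
    """Build a 2x3 similarity transform that maps the eyes to fixed positions.

    Hypothetical helper: the eyes end up on a horizontal line at height
    eye_y * out_size, centred horizontally, eye_dist * out_size apart.
    """
    left_eye = np.asarray(left_eye, dtype=np.float64)
    right_eye = np.asarray(right_eye, dtype=np.float64)

    delta = right_eye - left_eye
    angle = np.arctan2(delta[1], delta[0])  # tilt of the eye line
    scale = (eye_dist * out_size) / np.hypot(delta[0], delta[1])

    # Rotate by -angle so the eye line becomes horizontal, then scale.
    cos_a = np.cos(-angle) * scale
    sin_a = np.sin(-angle) * scale
    rot = np.array([[cos_a, -sin_a], [sin_a, cos_a]])

    # Translate so the midpoint between the eyes lands at the target centre.
    src_center = (left_eye + right_eye) / 2.0
    dst_center = np.array([0.5 * out_size, eye_y * out_size])
    trans = dst_center - rot @ src_center

    return np.hstack([rot, trans[:, None]])


def apply_affine(matrix, point):
    """Apply a 2x3 affine matrix to a single (x, y) point."""
    point = np.asarray(point, dtype=np.float64)
    return matrix[:, :2] @ point + matrix[:, 2]


# Example with made-up eye coordinates from a tilted face.
M = eye_alignment_matrix((100.0, 120.0), (160.0, 150.0))
print(apply_affine(M, (100.0, 120.0)), apply_affine(M, (160.0, 150.0)))
```

A matrix of this shape could then be passed to an image warping routine such as OpenCV's `cv2.warpAffine` to produce the aligned crop. Every frame goes through the same mapping, so the face ends up at the same position in every image.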

Mark-Dou commented 2 years ago

I have another question. celebahq_crop aligns the faces in the raw videos, but this also changes the background of the image. How can we eliminate the effect of the manipulated images having backgrounds that differ from those of the pristine images?

jtchen0528 commented 2 years ago

Sorry for the late response.

I'm not sure I understood the question. The background of a manipulated image differs from that of the pristine image the pasted face came from; it belongs to the pristine image whose face was cropped out. Both backgrounds are pristine. The only manipulated part is the face region, where one face was cropped out and another was pasted in.
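
The relationship between the two images can be sketched as a simple mask blend. This is an illustrative assumption, not the generator used in this repository; `blend_face` and its arguments are hypothetical names. The mask is 1 where the pasted face goes and 0 elsewhere, so every pixel outside the mask still comes from the pristine target image.

```python
import numpy as np


def blend_face(source, target, mask):
    """Paste the masked region of `source` onto `target`.

    source, target: float arrays of shape (H, W, C), both pristine images.
    mask: array of shape (H, W) with values in [0, 1]; 1 marks the pasted face.
    """
    source = np.asarray(source, dtype=np.float64)
    target = np.asarray(target, dtype=np.float64)
    mask = np.asarray(mask, dtype=np.float64)
    if mask.ndim == source.ndim - 1:
        mask = mask[..., None]  # broadcast the mask over the colour channels
    return mask * source + (1.0 - mask) * target


# Toy example: a 4x4 "image" whose centre 2x2 block is replaced.
source = np.full((4, 4, 3), 200.0)
target = np.full((4, 4, 3), 50.0)
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1.0
blended = blend_face(source, target, mask)
```

In this toy case the corners of `blended` keep the target value and only the centre block takes the source value, which mirrors the point above: the background stays pristine and only the face region changes.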

The paper does not consider the effect of the backgrounds, since the inconsistency appears only at the border of the manipulated face: the pasted face is inconsistent with the remaining part of the face around it. Hence, we can compute the consistency volume between the background and the face.
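
As a rough sketch of what a consistency volume between patches could look like, the snippet below derives one from a manipulation mask. The function names, the 4x4 patch grid, the average pooling, and the rule `1 - |m_i - m_j|` for the consistency of two patches are assumptions made for illustration; check the paper and this repository's code for the actual definition.

```python
import numpy as np


def patch_mask(mask, grid):
    """Average a binary (H, W) mask over a grid x grid layout of patches."""
    h, w = mask.shape
    ph, pw = h // grid, w // grid
    cropped = np.asarray(mask[: ph * grid, : pw * grid], dtype=np.float64)
    # Result[i, j] is the fraction of manipulated pixels in patch (i, j).
    return cropped.reshape(grid, ph, grid, pw).mean(axis=(1, 3))


def consistency_volume(mask, grid=4):
    """Return a (grid, grid, grid, grid) volume of pairwise patch consistency.

    Assumed rule: two patches are consistent (1.0) when they contain the same
    fraction of manipulated pixels, and inconsistent (0.0) when one is fully
    manipulated and the other fully pristine.
    """
    m = patch_mask(mask, grid)
    return 1.0 - np.abs(m[:, :, None, None] - m[None, None, :, :])


# Toy mask: the centre 4x4 block of an 8x8 image is the pasted face.
mask = np.zeros((8, 8))
mask[2:6, 2:6] = 1.0
volume = consistency_volume(mask, grid=4)

# A pristine image has an all-zero mask, so every patch pair is consistent.
pristine_volume = consistency_volume(np.zeros((8, 8)), grid=4)
```

Under this assumed rule, background patches are consistent with each other, face patches are consistent with each other, and only background-face pairs score low. That is why the pristine background itself does not need special handling.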

When we align the faces, we rotate and resize each image so that the face sits at the same position in every image. So, in a pristine image, alignment seems to have only a minor effect on the consistency between the face and the background.

If we don't align the faces, the model would first have to find the face in the image. That combines face detection, face recognition, and deepfake detection in a single task, which seems too hard to achieve.