questions about the dataset

fjyen commented 4 years ago

Thank you for releasing the code and dataset!

I have some questions about preparing the data.

Downloading the Deep Face Lab (DFL) data from the link https://www.patreon.com/ctrl_shift_face given in readme.txt requires paying an extra fee. Is it possible to get the videos for free somewhere else?
The FF++ landmarks in bounding_boxes.zip seem to be those of the real videos. Since the videos in Deepfake, FaceSwap and Face2Face may have different lengths, do I need to generate the landmarks for the manipulated videos myself, or are there mapping rules for them?

Thanks!

JStehouwer commented 4 years ago

Hi fjyen!

Thanks for the interest in our work, and we hope it can help inspire your own.

We do not make the Deep Face Lab (DFL) data available ourselves because we also paid for it on that Patreon site. I do not know of anywhere it is available for free.

The bounding boxes for real and fake FF++ data are the same, because that allowed us to compute the difference between real and fake images for corresponding frames. The bounding boxes/landmarks therefore correspond to the real video, and we used the same bounding box to crop from the fake videos. If the videos differ in length, we start from the first frame of both and pair frames one-to-one until we reach the end of either video. If the real video has extra frames, we use them. If the fake video has extra frames, we cannot compute the manipulation mask for them, so we do not use them.
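
The pairing rule above can be sketched roughly as follows in Python. This is only an illustration, not the repository's code: the function names, the (left, top, right, bottom) box layout, grayscale frames stored as nested lists, and the plain absolute difference used as a stand-in for the mask computation are all assumptions.

```python
from typing import List, Tuple

# Assumed box layout (left, top, right, bottom); the real file format may differ.
Box = Tuple[int, int, int, int]
Frame = List[List[int]]  # hypothetical grayscale frame as rows of pixel values


def plan_frames(n_real: int, n_fake: int):
    """Split frame indices according to the pairing rule described above."""
    n_paired = min(n_real, n_fake)
    paired = list(range(n_paired))                 # same index in real and fake video
    real_only = list(range(n_paired, n_real))      # extra real frames: still used
    fake_dropped = list(range(n_paired, n_fake))   # extra fake frames: no mask, skipped
    return paired, real_only, fake_dropped


def crop(frame: Frame, box: Box) -> Frame:
    """Cut the box region out of a frame."""
    left, top, right, bottom = box
    return [row[left:right] for row in frame[top:bottom]]


def difference_map(real_frame: Frame, fake_frame: Frame, box: Box) -> Frame:
    """Crop both frames with the real video's box and take |real - fake| per pixel."""
    real_crop = crop(real_frame, box)
    fake_crop = crop(fake_frame, box)
    return [
        [abs(r - f) for r, f in zip(real_row, fake_row)]
        for real_row, fake_row in zip(real_crop, fake_crop)
    ]


if __name__ == "__main__":
    # A 5-frame real video paired with a 3-frame fake video.
    paired, real_only, fake_dropped = plan_frames(5, 3)
    print(paired, real_only, fake_dropped)
```

For a fake video that is longer than the real one, `plan_frames` returns the trailing fake indices in `fake_dropped`, which matches the case where no manipulation mask can be computed.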

The eye/mouth landmarks will not correspond to the fake videos in FF++ due to the mapping scheme above. But you should be able to use the cropped image to compute these landmarks if they are useful in your case.

I hope this answers your questions. Thanks again!

wheatdog commented 4 years ago

Hi @JStehouwer, thanks for your work.

I wonder how you downloaded the "fake face" data from https://www.patreon.com/ctrl_shift_face. I searched the posts for "data set", and judging from the descriptions, they are collections of real faces. For example,

Can you point me to the right posts? Thanks!

JStehouwer commented 4 years ago

Hi wheatdog,

I am trying to get in contact with the person who handled downloading the data from that source, but I am also having difficulty navigating that site to find which ones we used.

For Bowie, we have 2220 images in our dataset, so assuming automated face detection missed a few images, there should be a dataset for Bowie with a little more than 2220 images.

I will post again here when I have more information.

fjyen commented 4 years ago

@JStehouwer, thank you for replying.

I have another question about calculating the masks.

Do you provide a mapping to the CelebA image numbers that were chosen for generating the StarGAN fake images?

According to the paper, there are 2000 images chosen from CelebA for generating StarGAN images.

The image numbers under the stargan folder do not seem to map to CelebA directly, and I could not find a mapping file in the updated version of the provided dataset.

Thanks!

yyssmm commented 1 year ago

Hi @wheatdog, have you got the Deep Face Lab (DFL) data? If so, can you share a copy with me? Thanks!

JStehouwer / FFD_CVPR2020

questions about the dataset #4