universome / stylegan-v

[CVPR 2022] StyleGAN-V: A Continuous Video Generator with the Price, Image Quality and Perks of StyleGAN2
https://universome.github.io/stylegan-v

A question on the cropping of FaceForensics dataset #9

Closed johannwyh closed 2 years ago

johannwyh commented 2 years ago

Hello!

I have a question about the cropping of the FaceForensics dataset. I noticed that the cropping strategy is different from FFHQ and ArcFace. Is your implementation in preprocess_ffs.py equivalent to that of MoCoGAN-HD and TGAN-v2?

Do you use wide_crop or not?

Thanks!

universome commented 2 years ago

Hi! Our cropping implementation is equivalent to the one from MoCoGAN-HD and TGAN-v2: we use the cropping bounding boxes provided by the dataset. We do not use wide_crop, which is why FFS is an "unstable" dataset in this sense: its zoom changes too much with face movements. We tried wide_crop at some point in the project, hoping to obtain better visual results (forcing the model to learn face "shaking" needlessly spends its capacity), but it didn't help much, and you also can't directly compare to the baselines if you use it.

Here is how generations on the FFS dataset with wide crops looked for us at one point during development (motion was broken in our model at that time, which is why the motion in the videos looks so weird): https://user-images.githubusercontent.com/3128824/168465613-7ec6e158-b529-40e5-acc4-48b479257446.mp4
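For readers trying to picture the difference between the two cropping strategies discussed above, here is a minimal sketch. It is not the code from preprocess_ffs.py. It assumes the dataset gives one bounding box per frame, and it treats a "wide" crop as a single box covering the face over the whole clip, which is only one plausible reading of wide_crop. The names `Box`, `per_frame_crops`, `union_crop` and `zoom_variation` are hypothetical.

```python
from typing import List, Tuple

# Hypothetical box layout: (left, top, right, bottom) in pixels.
Box = Tuple[int, int, int, int]


def per_frame_crops(boxes: List[Box]) -> List[Box]:
    """Tight cropping: every frame uses its own bounding box.

    After resizing each crop to a fixed resolution, a smaller box
    looks "zoomed in" compared to a larger one.
    """
    return list(boxes)


def union_crop(boxes: List[Box]) -> List[Box]:
    """Assumed wide cropping: one box that covers all per-frame boxes,
    reused for every frame so the zoom stays constant."""
    left = min(b[0] for b in boxes)
    top = min(b[1] for b in boxes)
    right = max(b[2] for b in boxes)
    bottom = max(b[3] for b in boxes)
    return [(left, top, right, bottom)] * len(boxes)


def zoom_variation(crops: List[Box]) -> float:
    """Ratio of the widest to the narrowest crop; 1.0 means constant zoom."""
    widths = [right - left for left, _, right, _ in crops]
    return max(widths) / min(widths)


if __name__ == "__main__":
    # Made-up boxes for three frames of one clip.
    boxes = [(100, 80, 200, 180), (90, 70, 210, 190), (110, 90, 190, 170)]
    print(zoom_variation(per_frame_crops(boxes)))  # 1.5
    print(zoom_variation(union_crop(boxes)))       # 1.0
```

With tight per-frame boxes the apparent zoom varies across frames, which matches the "unstable" behaviour described above. A shared box removes that variation, but the data is then no longer cropped the same way as for the baselines.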

johannwyh commented 2 years ago

Many thanks. I have no further questions.

This issue can be closed.