mapooon / SelfBlendedImages

[CVPR 2022 Oral] Detecting Deepfakes with Self-Blended Images https://arxiv.org/abs/2204.08376

The AUC of FF++(c23) #12

Open WeinanGuan opened 1 year ago

WeinanGuan commented 1 year ago

I ran inference on Celeb-DF-v2 and FF++ (c23) with your pre-trained model. I get similar results on the former (AUC: 0.9381, AP: 0.9669). However, the video-level AUC on the latter is only 0.9051, with AP 0.9797. I also tested the performance separately on Deepfakes, FaceSwap, Face2Face, NeuralTextures, and FaceShifter. The results are as follows: DF-AUC: 0.9856 / AP: 0.9900; FS-AUC: 0.9698 / AP: 0.9747; F2F-AUC: 0.9094 / AP: 0.9192; NT-AUC: 0.8254 / AP: 0.8427; FSh-AUC: 0.8351 / AP: 0.8427. Except for Deepfakes, these numbers differ from what you reported in the supplementary. Did you run inference on the FF++ c23 dataset? I do not think c23 compression alone should cause such a significant performance drop. Would you provide some suggestions to solve this problem?
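For anyone comparing numbers: video-level AUC/AP are typically computed by averaging per-frame fake probabilities within each video before scoring. Below is a minimal sketch of that aggregation using scikit-learn; it is not the repository's actual evaluation code, and the `(video_id, label, probability)` input format is an assumption.

```python
# Minimal sketch (assumption: NOT the repository's actual evaluation code)
# of video-level AUC/AP: average each video's per-frame fake probabilities,
# then score the averaged predictions against the video labels.
from collections import defaultdict

from sklearn.metrics import average_precision_score, roc_auc_score


def video_level_metrics(frame_scores):
    """frame_scores: iterable of (video_id, label, fake_probability).

    Returns (AUC, AP) computed on per-video mean scores.
    """
    scores = defaultdict(list)
    labels = {}
    for vid, label, prob in frame_scores:
        scores[vid].append(prob)
        labels[vid] = label
    vids = sorted(scores)
    y_true = [labels[v] for v in vids]
    y_pred = [sum(scores[v]) / len(scores[v]) for v in vids]
    return roc_auc_score(y_true, y_pred), average_precision_score(y_true, y_pred)
```

Differences in how frames are sampled and aggregated (mean vs. max, number of frames per video) can by themselves shift video-level AUC by a few points.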

LOOKCC commented 1 year ago

My results on c23 are very similar to yours. Reporting c23 results is the consensus in the deepfake detection field, but the authors did not do so in the paper.

YU-SHAO-XU commented 1 year ago

@Vodka70 Could you tell me how to test on the FF++ dataset? Should we revise the code in inference._dataset.py?

WeinanGuan commented 1 year ago

Yes, you should revise the data path in inference.dataset.
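For reference, a hypothetical sketch of what such a data-path change could look like. The function name `init_ff` and the directory layout below are assumptions, not the repository's actual code; adapt the globs to wherever your FF++ download lives.

```python
# Hypothetical sketch of pointing an inference dataset at local FF++ (c23)
# videos. `init_ff` and the directory layout are assumed names, not the
# repository's actual code; edit the globs to match your own data path.
from glob import glob


def init_ff(root="data/FaceForensics++", method="Deepfakes", comp="c23"):
    """Collect fake and real video paths for one FF++ manipulation method."""
    fake = sorted(glob(f"{root}/manipulated_sequences/{method}/{comp}/videos/*.mp4"))
    real = sorted(glob(f"{root}/original_sequences/youtube/{comp}/videos/*.mp4"))
    paths = fake + real
    labels = [1] * len(fake) + [0] * len(real)  # 1 = fake, 0 = real
    return paths, labels
```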

YU-SHAO-XU commented 1 year ago

Hi,

Did you train from scratch using ResNet-34 / EfficientNet-B4 and then test on FF++ or CDF? My results aren't as good as the paper's.

Many thanks

WeinanGuan commented 1 year ago

@YU-SHAO-XU Sorry for the late reply. I re-trained from scratch with an ImageNet-pretrained EfficientNet-B4 and tested on FF++ and CDF. My results also aren't as good as those in the paper: FF-AUC: 0.9936, FF-DF-AUC: 0.9999, FF-FS-AUC: 0.9986, FF-F2F-AUC: 0.9979, FF-NT-AUC: 0.9780 (which is clearly weaker than the performance reported in the paper). Besides, my CDF-AUC is only 0.8995, short of the reported 0.9382. @mapooon Would you provide some suggestions about this? Are these normal experimental results given random factors? Many thanks~

mapooon commented 1 year ago

Try installing the packages with the requirements.txt we released at: https://drive.google.com/file/d/14ZjiBFG825qP8ZKE_XGzQ6yob4XHx_Ml/view?usp=share_link. After entering the Docker container, run: pip install -r requirements.txt. Please let me know if you succeed in reproducing the results. We will update the repository.

YU-SHAO-XU commented 1 year ago

I installed the packages in requirements.txt one by one and trained EfficientNet-B4; the results are 99.36 (FF++) and 90.06 (CDF), respectively. How about you, @Vodka70?! With batch size = 16, CDF AUC is 92.1...

githuboflk commented 1 year ago

I used the weights provided by the author to test on the DFDC image dataset, and the ACC was extremely poor. @YU-SHAO-XU

YU-SHAO-XU commented 1 year ago

@githuboflk With the weights provided by the author, my results are as good as the paper reports. If I train from scratch, the results are 99.36 (FF++) and 90.06 (CDF), respectively.

mapooon commented 1 year ago

Thanks to an enthusiastic collaborator, we found a bug in crop_dlib_ff.py. We have just fixed it, so please try again starting from the preprocessing step.

angelalife commented 1 year ago

After fixing the num_frames error in crop_dlib_ff.py and retraining the model, the CDF AUC still did not reach the 0.9318 reported in the paper, only 0.9123. Is there anything else that needs to be modified? In addition, I found that many of the faces detected by RetinaFace are background faces. Could this problem lead to low test results? @mapooon
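One common workaround for spurious background faces is to keep only the largest detected box per frame, since background bystanders are usually much smaller than the foreground subject. A sketch under the assumption that detections come back as `(x1, y1, x2, y2, score)` tuples; this is not the repository's detection code or RetinaFace's actual output format.

```python
# Hypothetical post-filter for face detections (e.g. from RetinaFace):
# keep only the largest bounding box per frame. The (x1, y1, x2, y2, score)
# box format is an assumption for illustration, not the repo's API.
def largest_face(detections):
    """Return the detection with the largest box area, or None if empty."""
    if not detections:
        return None
    return max(detections, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))
```

Whether this matches the paper's protocol is a separate question; if training crops included background faces, the same filter would need to be applied consistently at train and test time.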

Blosslzy commented 11 months ago

Would anyone happen to be conducting cross-manipulation evaluation on FF++(c40)? I obtained results that were not as favorable as I had hoped. I would greatly appreciate any assistance or insights.

lihanzhe commented 6 months ago

@WeinanGuan Hi, have you solved the FF++ (c23) problem described above? Can you tell me the reason for it?

lihanzhe commented 6 months ago

The authors' newly published model does not address this issue either.

Elahe-sd commented 1 month ago

I used the weights provided by the author, extracted the last fully connected layer, and trained a classifier as simple as the one in the original network, but the results on CDF do not go above 91.90. Would you help me with this problem?

lihanzhe commented 1 month ago

@Elahe-sd Is your batch size 32? A low batch size may decrease the AUC.

Elahe-sd commented 1 month ago

Actually, it was 16. After your suggestion, I changed it to 32 and checked; the result is still the same.