hongshuochen / DefakeHop

Official code for DefakeHop: A Light-Weight High-Performance Deepfake Detector
https://arxiv.org/abs/2103.06929

Unable to train the model #4

Open · wasim004 opened this issue 2 years ago

wasim004 commented 2 years ago

Hey, this is nice work and really appreciated.

I am getting an error when I run python model.py. May I ask what the problem could be? Thanks!

[Screenshot from 2021-12-08 15-44-53]

hongshuochen commented 2 years ago

Please pull the newest version of the code! We have updated it!

wasim004 commented 2 years ago

Thank you for the response! Where can I find that? Thanks!

wasim004 commented 2 years ago

I have fixed that issue; now I am getting another one. Can you please have a look at it? Thanks!

[Screenshot from 2021-12-10 15-56-36]

hongshuochen commented 2 years ago

Are you using your own videos? This should be the expected output if you use my data:

    ==============================left_eye==============================
    ===============DefakeHop Prediction===============
    ===============MultiChannelWiseSaab Transformation===============
    Hop1
    Input shape: (2708, 32, 32, 3)
    Output shape: (2708, 15, 15, 13)

wasim004 commented 2 years ago

Hi, thanks for your kind response. Your output looks like this:

[Screenshot from 2021-12-13 17-04-01]

The final result looked like this:

[Screenshot from 2021-12-13 17-08-34]

wasim004 commented 2 years ago

I used the Celeb-DF-v1 dataset. I preprocessed it and created the .npz files as the model requires, but I got this error.

[Screenshot from 2021-12-10 15-56-36]
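For reference, here is a minimal sketch of how such .npz files could be packaged. The key names (images, labels, names) and the 4D image layout are taken from the loader snippet shown later in this thread; the array contents, label convention, and file path are placeholders:

    import numpy as np

    # Placeholder arrays standing in for real preprocessed data:
    # N cropped 32x32 RGB patches plus one label and one video name per patch.
    N = 100
    images = np.zeros((N, 32, 32, 3), dtype="float32")    # 4D: (N, H, W, C)
    labels = np.zeros(N, dtype="int64")                   # e.g. 0 = real, 1 = fake (assumed)
    names = np.array(["video_%d" % i for i in range(N)])  # source video of each patch

    # Save with the same keys that model.py reads back via np.load.
    np.savez("data/left_eye.test.npz", images=images, labels=labels, names=names)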

hongshuochen commented 2 years ago

Sorry for the late reply! Here is the data for Celeb-DF-v1; I followed the same code and got it. Also, please update model.py, as I think there was a small error due to the change in the structure of this repo. https://drive.google.com/drive/folders/1nEBe5wGPmm2G1NsR46NK8msHiCUE9f8K?usp=sharing

Please let me know whether you can get the results or not! Thank you!

wasim004 commented 2 years ago

Thank you so much for your kind response!

Well, I have already modified model.py, thanks. But I get the same error even on your data. Please have a look at it.

Thanks!

[Screenshot from 2021-12-16 12-56-24]

hongshuochen commented 2 years ago

Could you help me check this part of the code in model.py? test_images should be a 4D numpy array!

    import numpy as np

    for region in model.regions:
        path = 'data/' + region + '.test.npz'
        data = np.load(path)          # .npz archive holding three arrays
        test_labels = data['labels']  # label for each sample
        test_images = data['images']  # must be a 4D array: (N, H, W, C)
        test_names = data['names']    # source video name for each sample
        model.predict_region(region, test_images, test_names)

wasim004 commented 2 years ago

Thank you for the response..!

Sure, you may have a look at it. Thanks!

[Screenshot from 2021-12-16 17-37-59]

hongshuochen commented 2 years ago

Lines 169 and 170 are wrong. Also, please load the testing data with np.load, since we made some changes to the structure of this repo.

wasim004 commented 2 years ago

Thank you so much! I finally solved the issue.

I still have to run some other datasets, and I am unable to do that due to my system's limited memory. Anyway, thank you for your time and help. I will contact you if I face any other issues.

Thanks!

DeepDetector commented 2 years ago

Hey, I extracted feature data (Celeb-DF-v2) with shape (761334, 32, 32, 3), and it runs out of memory when I run model.py. So how do you run the large dataset? Thanks!

wasim004 commented 2 years ago

> Hey, I extracted feature data (Celeb-DF-v2) with shape (761334, 32, 32, 3), and it runs out of memory when I run model.py. So how do you run the large dataset? Thanks!

Hey, I am having the same issue. I asked one of the authors of the paper, and she said I can divide the data into chunks and train the model. But I am trying to arrange more resources instead, as I believe chunking will affect the performance of the model, and our results would not be comparable to this work or others due to the different setup. Thanks!

hongshuochen commented 2 years ago

Hi! Thank you for your question; it is a really great one! I will update the code for this problem. The idea is that we subsample the dataset and use the subset to train the Saab transform. For prediction, the whole dataset is used by dividing it into many chunks and predicting each chunk one by one. Using a subset to compute the Saab kernels works because it has been demonstrated that the kernels come out nearly the same when the number of samples is large. For XGBoost, we still use all the samples to train. I will update the code as soon as possible!
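A minimal sketch of that idea, with hypothetical fit/transform method names standing in for the repo's actual Saab API:

    import numpy as np

    def fit_on_subset(saab, images, n_samples=100000, seed=0):
        # Fit the Saab kernels on a random subset; with a large enough
        # sample count the kernels come out nearly the same as a full fit.
        rng = np.random.default_rng(seed)
        idx = rng.choice(len(images), size=min(n_samples, len(images)), replace=False)
        saab.fit(images[idx])  # hypothetical fit method name
        return saab

    def transform_in_chunks(saab, images, chunk_size=10000):
        # Transform the full dataset chunk by chunk so only one chunk is
        # resident in memory at a time, then stitch the outputs together.
        outputs = []
        for start in range(0, len(images), chunk_size):
            chunk = images[start:start + chunk_size]
            outputs.append(saab.transform(chunk))  # hypothetical transform method name
        return np.concatenate(outputs, axis=0)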

DeepDetector commented 2 years ago

Thank you for the response! And looking forward to your update~

hongshuochen commented 2 years ago

Solved! Please try the new saab.py!

wasim004 commented 2 years ago

> Solved! Please try the new saab.py!

Hi, I tried to run it, but this time the issue seems different. Thanks!

[Screenshot from 2021-12-30 17-02-59]

hongshuochen commented 2 years ago

Did you update saab.py? Could you show me the fit function in your saab.py?

DeepDetector commented 2 years ago

Thank you! The new saab.py works. But this time the issue is XGBoost; it still needs a lot of memory.

[screenshot attached]

hongshuochen commented 2 years ago

Change gpu_hist to hist: https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/defakeHop.py#L155
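For context, tree_method is a standard XGBoost parameter: gpu_hist builds the tree histograms on the GPU (and needs GPU memory), while hist is the CPU equivalent. A minimal sketch of the swap; the repo's other classifier settings are not reproduced here:

    import xgboost as xgb

    # was: tree_method="gpu_hist" (GPU histogram method)
    clf = xgb.XGBClassifier(tree_method="hist")  # CPU histogram method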

DeepDetector commented 2 years ago

> Change gpu_hist to hist: https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/defakeHop.py#L155

Thank you for your response! But now I have the same issue as @wasim004. I updated saab.py and only modified the file path. The program gets killed at the right eye region, with shape (134069, 32, 32, 3).

hongshuochen commented 2 years ago

https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/saab.py#L38

Could you check at which line in fit your program gets killed?

wasim004 commented 2 years ago

> https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/saab.py#L38
>
> Could you check at which line in fit your program gets killed?

Hi, I updated saab.py but am still getting the same error.

[Screenshot from 2022-01-01 16-43-05]

DeepDetector commented 2 years ago

Hey, I guess the issue is still out of memory. I tried three shapes:

1) (100000, 32, 32, 3): it works!
2) (134069, 32, 32, 3): killed.
3) (761334, 32, 32, 3): MemoryError: Unable to allocate array with shape (761334, 32, 32, 3).

The error occurs in saab.py at output = np.zeros((N, H, W, n_channels), dtype="float64").

hongshuochen commented 2 years ago

Please change all "float64" to "float32" in saab.py! And change the batch size from 50000 to 10000! https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/saab.py#L111
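A quick back-of-the-envelope calculation shows why both changes help, using the array shape reported above:

    import numpy as np

    shape = (761334, 32, 32, 3)
    n = np.prod(shape, dtype=np.int64)      # ~2.34e9 elements
    print(n * 8 / 2**30)                    # float64: ~17.4 GiB for one full copy
    print(n * 4 / 2**30)                    # float32: ~8.7 GiB, half the footprint
    print(10000 * 32 * 32 * 3 * 4 / 2**20)  # one 10000-sample float32 batch: ~117 MiB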

wasim004 commented 2 years ago

Hi @hongshuochen,

I've finally managed to run and train the model and got results. I trained the model with both the original and the modified saab.py on the Celeb-DF-v2 dataset. I got the results below:

[Screenshot from 2022-01-05 15-43-38] [Screenshot from 2022-01-05 15-44-11]

The results in your original paper are different from what I got for the Celeb-DF-v2 dataset. Do you have any idea what the possible reason could be?

wasim004 commented 2 years ago

I also trained the model on Celeb-DF-v1 and got the following results:

My data: Frame (0.7971), Video (0.8453)
Your data: Frame (0.9138), Video (0.9363)

hongshuochen commented 2 years ago

The reason for this is that you changed "float64" to "float32". If you run with "float64", you can get the same results that I get!

wasim004 commented 2 years ago

> The reason for this is that you changed "float64" to "float32". If you run with "float64", you can get the same results that I get!

Hi, thanks for the response. I did not change anything; I ran your code as it is, because I found a server to run it on, so I did not have to modify anything in the code. Thanks!

hongshuochen commented 2 years ago

Hi @wasim004, I think you only use 2 regions, right? Please use 3 regions! I recloned the repo and downloaded the data from https://drive.google.com/drive/u/1/folders/1nEBe5wGPmm2G1NsR46NK8msHiCUE9f8K. I just ran Celeb-DF-v1, and this is the result I get:

[screenshot attached]

wasim004 commented 2 years ago

Hi, I still get the same results.

[Screenshot from 2022-01-05 17-47-36]

hongshuochen commented 2 years ago

Can you check this line? Your feature shape should be (6065, 540) instead of (6065, 360): https://github.com/hongshuochen/DefakeHop/blob/941efb6a3d11b59bf0c4d56c95f75612a9f4da4e/model.py#L153
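The 540-vs-360 difference is consistent with concatenating per-region features from three regions instead of two (180 features per region is inferred here from 540 / 3; the real width comes from DefakeHop). A small sanity check with placeholder arrays:

    import numpy as np

    # Placeholder per-region feature blocks: 6065 samples, 180 features each.
    regions = [np.zeros((6065, 180)) for _ in range(3)]
    features = np.concatenate(regions, axis=1)
    print(features.shape)  # (6065, 540); only two regions would give (6065, 360)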

wasim004 commented 2 years ago

Hi, the actual problem is that my input shape is 5190 while yours is 75242. Also, the data I get after running data.py is 500 MB in size, while yours is 783 MB.

So I tried to re-extract the landmarks and patches, following the same steps, but again I got the same data size and shape. My Celeb-DF-v1 dataset contains Test (32 real + 159 fake) + Train (126 real + 639 fake) videos. My Celeb-DF-v2 dataset contains Test (118 real + 1128 fake) + Train (472 real + 4511 fake) videos.

Can you please confirm your dataset sizes? Thanks!

wasim004 commented 2 years ago

Hi, after including the mouth region, the results on your data are now fine.

[Screenshot from 2022-01-05 18-51-55]

Thanks!