Closed Saleh-Gholam-Zadeh closed 3 years ago
Let me add that I did some more debugging and found that the 'source' key exists in the video DataFrame (not the faces DataFrame), but we don't pass the video DataFrame as an argument to train_binclass.py; we pass the faces DataFrame path to the train_binclass script.
Hey @Saleh-Gholam-Zadeh ,
thanks for the feedback, I think you found a bug in the code that we didn't catch during our review.
Specifically, in the function `process_video` inside `extract_faces.py`, some of the columns of the original `df_videos` (among which there is the "source" you found while debugging) are not copied into the resulting DataFrame of faces. However, they are used during training to create the train/val/test splits indicated in the paper (that's why the bug only showed up when running `train_binclass.py`). I'm fixing this in the next commit.
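For illustration, the kind of fix involved can be sketched roughly like this — note this is a hypothetical example with made-up column and variable names, not the actual repository code:

```python
import pandas as pd

# Hypothetical sketch: carry over per-video metadata (e.g. "source",
# "quality") from the videos DataFrame into each extracted-face record.
df_videos = pd.DataFrame({
    'path': ['a.mp4', 'b.mp4'],
    'source': ['youtube', 'Deepfakes'],
    'quality': [23, 23],
}).set_index('path')

df_faces = pd.DataFrame({
    'video': ['a.mp4', 'a.mp4', 'b.mp4'],  # which video each face came from
    'conf': [0.9, 0.8, 0.95],
})

# Copy the missing columns by looking them up via the video path
for col in ['source', 'quality']:
    df_faces[col] = df_videos.loc[df_faces['video'], col].values

print(df_faces['source'].tolist())  # ['youtube', 'youtube', 'Deepfakes']
```

The key point is that every column the training splits rely on must survive the video-to-faces conversion, not just the face bounding boxes.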
Again, thank you for pointing out 💯
Edoardo
This commit should solve your problem; feel free to write us again/reopen the issue if it doesn't work for you. In case we don't talk again, happy new year :)
Edoardo
Thanks a lot, I looked at your last commit; I had made exactly the same change as you did. I'm trying to train your model on the FaceShifter dataset, which has been released recently and can be accessed with the same procedure as the FF++ dataset, i.e., it is added as a new category beside the other methods such as Face2Face, FaceSwap, ...
In case you have already trained your model on FaceShifter, can you please let me know about the detection performance? Happy new year and wish you all the best. Saleh
Hey @Saleh-Gholam-Zadeh ,
> I looked at your last commit, I had made exactly the same change as you did
Great, that's good to know :)
> I'm trying to train your model on the FaceShifter dataset, which has been released recently and can be accessed with the same procedure as the FF++ dataset, i.e., it is added as a new category beside the other methods such as Face2Face, FaceSwap, ... In case you have already trained your model on FaceShifter, can you please let me know about the detection performance?
We didn't know about FaceShifter, so unfortunately we don't have detection results on it... Thank you anyway for letting us know! It would surely be interesting to check it out :) In any case, if you're interested in cross-dataset performance, we have done some other tests on Celeb-DF v2. You can find the results here: https://arxiv.org/abs/2011.07792. Hope it can help you!
Edoardo
Thank you so much, it is very useful.
I have another question. I attached 2 plots. In one of them I used the whole FF++ dataset, therefore it is imbalanced, and in the other one I used all the real videos and only one category of fake videos. I just want to know how you tackle the imbalance problem, since in Figure 6 of the paper I can see all the diagonal plots are based on a balanced dataset, whereas the FF++ dataset is not balanced. Another important thing is to set a correct threshold. In the imbalanced plot that I attached, the correct threshold for discriminating real from fake is not 0 anymore.
Finally, my last question: in the code, split.py lines 54 to 68 consider "youtube" videos as original (which is okay), however it is not the only original category; beside "youtube" there is the "actors" dataset (which is used in your second paper, as I saw). I just want to make sure you considered it in the current version of your code. Thanks a lot
Hey @Saleh-Gholam-Zadeh ,
> I have another question. I attached 2 plots. In one of them I used the whole FF++ dataset, therefore it is imbalanced, and in the other one I used all the real videos and only one category of fake videos. I just want to know how you tackle the imbalance problem, since in Figure 6 of the paper I can see all the diagonal plots are based on a balanced dataset, whereas the FF++ dataset is not balanced.
We tackled the imbalance problem by constructing balanced batches during training, i.e., every batch had an even number of elements with an equal number of FAKE and REAL samples. However, please be aware that we never processed the entire dataset during training, meaning that we never completed an entire epoch. The reason behind this choice is twofold; you can find all the details on page 5 of the paper.
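To make the idea concrete, here is a minimal, self-contained sketch of balanced batch construction — this is an illustrative example with hypothetical function and variable names, not the repository's actual sampler:

```python
import random

def balanced_batches(real_idx, fake_idx, batch_size, n_batches, seed=0):
    """Yield batches containing an equal number of REAL and FAKE indices.

    Sampling is with replacement, so an "epoch" over the full dataset is
    never required: minority-class samples are simply drawn more often.
    """
    rng = random.Random(seed)
    half = batch_size // 2
    for _ in range(n_batches):
        batch = rng.choices(real_idx, k=half) + rng.choices(fake_idx, k=half)
        rng.shuffle(batch)  # mix the two classes inside the batch
        yield batch

# Toy imbalanced pool: 10 real videos vs. 50 fake ones
real = list(range(10))
fake = list(range(100, 150))
for batch in balanced_batches(real, fake, batch_size=8, n_batches=3):
    n_real = sum(i < 100 for i in batch)
    print(len(batch), n_real)  # every batch: 8 items, 4 of them real
```

Every batch the model sees is 50/50 regardless of the dataset's overall class ratio, which is one common way to neutralize imbalance at training time.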
> Another important thing is to set a correct threshold. In the imbalanced plot that I attached, the correct threshold for discriminating real from fake is not 0 anymore.
Who says that the correct threshold should be 0? :) I'm just kidding, but please be aware that the problem of choosing a correct threshold is neither simple nor intuitive. Ideally, we would want a binary classification algorithm to predict samples of one class with a score below/above half of the final classification score range, i.e., in our case with a sigmoid score > or < 0.5, supposing that a score = 1 perfectly predicts a FAKE sample and a score = 0 perfectly predicts a REAL sample (please notice that the final score returned by our networks is not normalized! I can see from your plots that you didn't normalize it; it is not a big deal, but be aware of it).

However, such an idealization seldom works in real-case scenarios, and at least in the multimedia forensics literature I have read so far, the most common approach when evaluating a detection algorithm is to resort to Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) metric. This measure is extremely useful, especially when comparing different classification algorithms (as we have done in our work), as it does not impose any threshold on the final classification score to retrieve an accuracy measure, yet the AUC still gives some information on the overall performance of the algorithms. It is far from perfect, but personally I think it is fairer than computing accuracies with an arbitrary threshold :) In any case, there are some methods to compute an optimal threshold from the ROC curve; you can find some hints here.
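As a small illustration of these two ideas (threshold-free AUC, and deriving a threshold from the ROC curve), here is a hedged, dependency-light sketch; the function names and toy scores are made up for the example, and a real evaluation would typically use a library such as scikit-learn instead:

```python
import numpy as np

def roc_auc(scores, labels):
    """AUC via the rank-sum (Mann-Whitney U) formulation: the probability
    that a random positive sample scores higher than a random negative one."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    wins = (pos[:, None] > neg[None, :]).sum() + 0.5 * (pos[:, None] == neg[None, :]).sum()
    return wins / (len(pos) * len(neg))

def best_threshold(scores, labels):
    """Threshold maximizing Youden's J = TPR - FPR over candidate cut points,
    one common way to pick an operating point from the ROC curve."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    best_t, best_j = None, -1.0
    for t in np.unique(scores):
        tpr = (scores[labels] >= t).mean()   # true positive rate at cut t
        fpr = (scores[~labels] >= t).mean()  # false positive rate at cut t
        if tpr - fpr > best_j:
            best_j, best_t = tpr - fpr, t
    return best_t

# Unnormalized, logit-like scores: FAKE should score high, REAL low
scores = [-2.0, -1.5, -0.2, 0.4, 1.1, 2.3]
labels = [0, 0, 0, 1, 1, 1]   # 1 = FAKE
print(roc_auc(scores, labels))         # 1.0 (perfect separation)
print(best_threshold(scores, labels))  # 0.4, not 0 or 0.5
```

Note how the best operating threshold lands wherever the two score distributions separate, which need not be 0 (or 0.5 after a sigmoid), especially with unnormalized scores or imbalanced data.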
> Finally, my last question: in the code, split.py lines 54 to 68 consider "youtube" videos as original (which is okay), however it is not the only original category; beside "youtube" there is the "actors" dataset (which is used in your second paper, as I saw). I just want to make sure you considered it in the current version of your code.
You're right! Not 100% sure about it, but I think we didn't include it for the same reason listed above (usually the network overfitted with just the REAL samples from the "youtube" category). I'll check on this and eventually fix it. Thanks again for the feedback, really appreciated :) 💪 💪
Edoardo
Hello,
I used the command below to train the Xception net on the FaceForensics++ dataset, which contains "youtube" and "actors" as real videos and "DeepFakeDetection", "Deepfakes", "Face2Face", "FaceShifter", "FaceSwap", "NeuralTextures" as fake videos.
First of all, I ran index_ffp.py and produced ffpp_videos.pkl. In the next step I ran

```
python extract_faces.py \
  --source path/to/faceforensics++/dataset \
  --videodf ./data/ffpp_videos.pkl \
  --facesfolder ./output_faces \
  --facesdf ./output_faces_df \
  --checkpoint ./tmp
```

and 2 things were created: 1) the output_faces folder with all extracted frames inside; 2) a pickle file "output_faces_df_from_video_0_to_video_0.pkl" (whose name was a bit weird, and it was 40 MB) created inside the project folder (beside extract_faces.py). I created a folder "output_faces_df" manually and put the pickle file inside it.
In the last step I ran
```
python train_binclass.py --net Xception --traindb ff-c23-720-140-140 \
  --valdb ff-c23-720-140-140 \
  --ffpp_faces_df_path ./output_faces_df/output_faces_df_from_video_0_to_video_0.pkl \
  --ffpp_faces_dir ./output_faces --face scale --size 224 --batch 32 --lr 1e-5 \
  --valint 500 --patience 10 --maxiter 30000 --seed 41 --attention --device 0
```
However, I'm getting the error below:
```
/home/saleh/anaconda3/envs/icpr2020/bin/python /home/saleh/Documents/internship/icpr2020dfdc/train_binclass.py --net Xception --traindb ff-c23-720-140-140 --valdb ff-c23-720-140-140 --ffpp_faces_df_path ./output_faces_df/output_faces_df_from_video_0_to_video_0.pkl --ffpp_faces_dir ./output_faces --face scale --size 224 --batch 32 --lr 1e-5 --valint 500 --patience 10 --maxiter 30000 --seed 41 --attention --device 0
Parameters
{'face': 'scale', 'net': 'Xception', 'seed': 41, 'size': 224, 'traindb': 'ff-c23-720-140-140'}
Tag: net-Xception_traindb-ff-c23-720-140-140_face-scale_size-224_seed-41
Loading data
Traceback (most recent call last):
  File "/home/saleh/anaconda3/envs/icpr2020/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2898, in get_loc
    return self._engine.get_loc(casted_key)
  File "pandas/_libs/index.pyx", line 70, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/index.pyx", line 101, in pandas._libs.index.IndexEngine.get_loc
  File "pandas/_libs/hashtable_class_helper.pxi", line 1675, in pandas._libs.hashtable.PyObjectHashTable.get_item
  File "pandas/_libs/hashtable_class_helper.pxi", line 1683, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'source'

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/saleh/Documents/internship/icpr2020dfdc/train_binclass.py", line 460, in <module>
    main()
  File "/home/saleh/Documents/internship/icpr2020dfdc/train_binclass.py", line 227, in main
    dbs={'train': train_datasets, 'val': val_datasets})
  File "/home/saleh/Documents/internship/icpr2020dfdc/isplutils/split.py", line 107, in make_splits
    split_df = get_split_df(df=full_df, dataset=split_db, split=split_name)
  File "/home/saleh/Documents/internship/icpr2020dfdc/isplutils/split.py", line 57, in get_split_df
    df[(df['source'] == 'youtube') & (df['quality'] == crf)]['video'].unique())
  File "/home/saleh/anaconda3/envs/icpr2020/lib/python3.6/site-packages/pandas/core/frame.py", line 2906, in __getitem__
    indexer = self.columns.get_loc(key)
  File "/home/saleh/anaconda3/envs/icpr2020/lib/python3.6/site-packages/pandas/core/indexes/base.py", line 2900, in get_loc
    raise KeyError(key) from err
KeyError: 'source'

Process finished with exit code 1
```
The error comes from the DataFrame, so I tried to debug the code and saw that the DataFrame was successfully loaded, but it doesn't have a column named 'source'. As an example, I printed a sample row of the DataFrame and it has the following keys:
```
df.iloc[19000]
Out[3]:
video                607
label              False
videosubject           0
kp1x                 189
kp1y                 170
kp2x                 332
kp2y                 164
kp3x                 271
kp3y                 227
kp4x                 271
kp4y                 315
kp5x                 113
kp5y                 226
kp6x                 398
kp6y                 211
conf            0.914732
left                  29
top                    0
right                483
bottom               487
nfaces                 1
Name: original_sequences/youtube/c23/videos/671.mp4/fr132_subj0.jpg, dtype: object
```
It seems that the 'source' should be 'youtube'; however, we don't have such a key in the DataFrame. So what should I do?
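For reference, since the face index path shown above already encodes the source and quality (e.g. "original_sequences/youtube/c23/videos/671.mp4/fr132_subj0.jpg"), one possible stopgap while debugging is to reconstruct the missing columns from that path. This is a hypothetical sketch under the assumption that the directory layout always follows the pattern above; it is not the repository's official fix:

```python
import pandas as pd

# Toy DataFrame mimicking the faces pickle: metadata columns missing,
# but the index path still carries source and quality information.
df = pd.DataFrame(
    {'conf': [0.91, 0.88]},
    index=['original_sequences/youtube/c23/videos/671.mp4/fr132_subj0.jpg',
           'manipulated_sequences/Deepfakes/c23/videos/671_580.mp4/fr10_subj0.jpg'])

if 'source' not in df.columns:
    # Assumed layout: <set>/<source>/<quality>/videos/<video>/<frame>.jpg
    parts = df.index.to_series().str.split('/')
    df['source'] = parts.str[1]
    df['quality'] = parts.str[2].str.lstrip('c').astype(int)

print(df['source'].tolist())   # ['youtube', 'Deepfakes']
print(df['quality'].tolist())  # [23, 23]
```

With those columns restored, a lookup like `df[(df['source'] == 'youtube') & (df['quality'] == crf)]` in split.py would no longer raise `KeyError: 'source'`, though re-running the fixed face extraction is the cleaner solution.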