Overfitting on CASIA-v2

yelusaleng / RRU-Net

Official repository for "RRU-Net: The Ringed Residual U-Net for Image Splicing Forgery Detection" (CVPRW 2019)


Overfitting on CASIA-v2 #26

Open soroushhashemifar opened 2 years ago

soroushhashemifar commented 2 years ago

Hi. I tried to fine-tune your RRU-Net model on the CASIA-v2 and CoMoFoD datasets using your repository, but the model overfits. Is this normal?

yelusaleng commented 2 years ago

hi, what are your training and testing datasets, respectively?

soroushhashemifar commented 2 years ago

Thanks for your response. I split the tampered images of CASIA-v2 into train and val sets with a val ratio of 10%: around 4000 images for training and 1000 images for validation. I applied the same procedure to CoMoFoD in a separate experiment. In both cases the model overfits the dataset, and the val loss starts to increase gradually after the 5th epoch. My loss function is BCEWithLogitsLoss, and I use the Adam optimizer with LR=1e-3 and an LR scheduler.
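For readers trying to reproduce a split like the one described, here is a minimal sketch of a seeded train/val split. It is not the code used in this thread; the function name `split_train_val`, the seed, and the file names are all assumptions made for illustration.

```python
import random

def split_train_val(paths, val_ratio=0.1, seed=0):
    """Shuffle once with a fixed seed, then hold out a val_ratio share."""
    if not 0.0 < val_ratio < 1.0:
        raise ValueError("val_ratio must be between 0 and 1")
    # Sort first so the result does not depend on the input order.
    shuffled = sorted(paths)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, round(len(shuffled) * val_ratio))
    # Validation items come first in the shuffled list, the rest is training.
    return shuffled[n_val:], shuffled[:n_val]

# Hypothetical file names, for illustration only.
paths = [f"Tp_{i:05d}.jpg" for i in range(5000)]
train_paths, val_paths = split_train_val(paths, val_ratio=0.1)
print(len(train_paths), len(val_paths))
```

With these 5000 placeholder files, a ratio of 0.1 holds out 500 items, while a ratio of 0.2 would give the 4000/1000 proportions mentioned in the comment above. Keeping the seed fixed makes the validation set identical across runs, so loss curves from different experiments stay comparable.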

soroushhashemifar commented 2 years ago

I also have a combined dataset built from multiple forgery datasets, with more than 16000 tampered images. RRU-Net overfits on this dataset as well.

yelusaleng commented 2 years ago

I think your case is underfitting, not overfitting. This is abnormal, since RRU-Net couldn't even demonstrate good performance on the training and testing sets of CASIA-v2.
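The distinction drawn here can be made concrete with a rough heuristic on the per-epoch loss curves. The sketch below is only an illustration: the function `diagnose` and its thresholds `high_loss` and `gap` are assumptions, not values from this thread or from the RRU-Net repository.

```python
def diagnose(train_loss, val_loss, high_loss=0.5, gap=0.1):
    """Label a pair of per-epoch loss curves with a rough heuristic.

    high_loss and gap are illustrative thresholds (assumptions).
    Returns (label, best_epoch), where best_epoch is 0-indexed.
    """
    if len(train_loss) != len(val_loss) or not train_loss:
        raise ValueError("need two non-empty curves of equal length")
    # Epoch with the lowest validation loss (a natural early-stopping point).
    best_epoch = min(range(len(val_loss)), key=val_loss.__getitem__)
    final_train, final_val = train_loss[-1], val_loss[-1]
    if final_train > high_loss:
        # The model does not even fit the training set.
        label = "underfitting"
    elif final_val - final_train > gap and best_epoch < len(val_loss) - 1:
        # Training loss is low, but validation loss has turned upward.
        label = "overfitting"
    else:
        label = "ok"
    return label, best_epoch

# Synthetic curves, for illustration only.
print(diagnose([0.9, 0.8, 0.75], [0.9, 0.85, 0.8]))
print(diagnose([0.6, 0.3, 0.1, 0.05], [0.6, 0.35, 0.4, 0.5]))
```

The first synthetic pair is labelled as underfitting because the training loss stays high; the second is labelled as overfitting because the training loss keeps falling while the validation loss bottoms out early. Sensible thresholds depend on the loss function and the dataset.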

yelusaleng commented 2 years ago

Please share the val loss curve on CASIA-v2 with me.

soroushhashemifar commented 2 years ago

[Figure: training process for lr=0.001]

Maybe that's what happened in my case. I have attached the curves above.

yelusaleng commented 2 years ago

This figure looks normal; it's neither overfitting nor underfitting.

soroushhashemifar commented 2 years ago

Hello again. The training on the large dataset is finished and here is the result:

[Figure: training process for lr=0.001]

The dataset is a mixture of CASIA-v2, IMD2020, and CoMoFoD, with about 15000 training samples and 1000 validation samples. The resulting model cannot detect even simple forgeries. Do you have any idea what's going on? (I didn't change your configuration; I only changed the dataset.)

yelusaleng commented 2 years ago

According to the figure, I think the model didn't learn useful information from the mixed dataset. How did you handle the different image sizes in CASIA, IMD2020, and CoMoFoD?
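One common way to bring datasets with different image sizes to a shared input size is to resize every image together with its ground-truth mask, using nearest-neighbor sampling for the mask so it stays binary. The thread does not say which approach was used, so the sketch below is only an assumption. It uses plain Python lists to stay self-contained; in practice an image library would do the resizing. The function name `resize_nearest` is hypothetical.

```python
def resize_nearest(grid, out_h, out_w):
    """Resize a 2D grid (list of rows) with nearest-neighbor sampling.

    No new values are created, so a 0/1 forgery mask stays strictly binary,
    unlike with bilinear interpolation.
    """
    if out_h <= 0 or out_w <= 0:
        raise ValueError("output size must be positive")
    in_h, in_w = len(grid), len(grid[0])
    return [
        # Map each output pixel back to its source pixel by integer scaling.
        [grid[(y * in_h) // out_h][(x * in_w) // out_w] for x in range(out_w)]
        for y in range(out_h)
    ]

# Tiny 2x2 mask, for illustration only: 1 marks tampered pixels.
mask = [[0, 1],
        [1, 0]]
up = resize_nearest(mask, 4, 4)
for row in up:
    print(row)
```

Applying the same target size to the image and the mask keeps them aligned pixel for pixel. Mixing sizes within a batch, or interpolating masks with a smoothing filter, can leave the mask misaligned or non-binary, which may hurt training with a per-pixel loss.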