mjkwon2021 / CAT-Net

Official code for CAT-Net: Compression Artifact Tracing Network. Image manipulation detection and localization.

Generalization on new data! #31

Open HamzaJavaid-gh opened 1 year ago

HamzaJavaid-gh commented 1 year ago

Hi!

Thanks for sharing this code. I have a question about training CAT-Net on a custom forgery dataset that is comparable to IMD2020 in size.

While training CAT-Net on the custom data, whether I start from the pre-trained weights or train from scratch, the model overfits: the validation loss keeps increasing while the training loss keeps decreasing.

Maybe we should fine-tune it instead of training the whole network, but even with a very low learning rate the metrics do not become stable. A rough sketch of what I mean is below.
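This is only a hypothetical PyTorch sketch of the fine-tuning setup I have in mind, not the exact code I ran; the model here is a stand-in and the layer names, checkpoint path, and learning rate are placeholders:

```python
import torch
import torch.nn as nn

# Stand-in module for illustration; in practice this would be the CAT-Net model
# built by the repo's code and loaded with the released pretrained checkpoint.
model = nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.Conv2d(16, 2, 1))
# model.load_state_dict(torch.load("pretrained_cat_net.pth"))  # placeholder path

# Freeze everything except the last block ("1." is just this stand-in's naming;
# for the real network it would be the final prediction layers).
for name, param in model.named_parameters():
    param.requires_grad = name.startswith("1.")

# Optimize only the unfrozen parameters, with a much smaller learning rate
# than the from-scratch setting.
optimizer = torch.optim.Adam(
    (p for p in model.parameters() if p.requires_grad), lr=1e-5
)
```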

Can you clarify how many datasets the original model was trained on (I could not find this explicitly in the paper), and can you suggest how to train CAT-Net so that it generalizes well to a custom dataset?

CauchyComplete commented 1 year ago

The datasets used for training can be found in the paper or the code: link. I cannot say for sure why your model does not train well because I have no information about your dataset. One possible reason is that it may not contain enough compression artifacts.
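If you want a quick sanity check, something like the following (an illustrative Pillow snippet, not part of this repo; the file name is hypothetical) tells you whether an image is actually a JPEG that carries quantization tables, which is the kind of compression information the DCT stream relies on:

```python
from PIL import Image

def has_jpeg_artifacts(path):
    """Rough check: is this file a JPEG with quantization tables?"""
    img = Image.open(path)
    # Pillow exposes quantization tables for JPEG files only;
    # PNGs or other lossless formats will fail this check.
    return img.format == "JPEG" and bool(getattr(img, "quantization", None))

print(has_jpeg_artifacts("example.jpg"))  # hypothetical file name
```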

HamzaJavaid-gh commented 1 year ago

Thank you so much for sharing the data details!

Just to get your opinion on this: I tested the pre-trained models on 5-10 images, and all the streams (RGB, DCT, and even full) worked really well on these images.

When I then used those same images for training and validation, just to overfit the network on this small set starting from the same pre-trained weights (which were performing really well), the network was not able to converge at all. The performance of the updated weights was not even close to your original pre-trained model's results.

CauchyComplete commented 1 year ago

The learning rate is probably too large. The learning rate decreases substantially as training progresses, so when you fine-tune the pretrained model, you should not reuse the learning rate that was set for training randomly initialized weights; that value is too large for fine-tuning.
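As a minimal sketch (illustrative only, not the repo's training script; the model is a stand-in and the numbers are placeholders), fine-tuning could start one to two orders of magnitude below the from-scratch learning rate and keep decaying:

```python
import torch
import torch.nn as nn

# Stand-in module; in practice, the CAT-Net model loaded with pretrained weights.
model = nn.Conv2d(3, 2, 3, padding=1)

scratch_lr = 1e-4                 # placeholder for the from-scratch learning rate in the config
finetune_lr = scratch_lr * 0.01   # start fine-tuning one to two orders of magnitude lower

optimizer = torch.optim.Adam(model.parameters(), lr=finetune_lr)

# Keep decaying during fine-tuning, since the original schedule also ends
# at a much smaller learning rate than it starts with.
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
```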