Unofficial implementation of MVSS-Net, which was proposed at ICCV 2021 by Xinru Chen, Chengbo Dong, et al. [1], including code for training! This unofficial implementation is supported by the DICA Lab of Sichuan University.
The original repo lacks the training code; the link is here: OFFICIAL MVSS-Net. We tried our best to reproduce the results of the model.
ALERT: Although this method may look like the current SOTA model, the current results cast many doubts on this paper. We do not recommend using it as a replication target or as a model to learn from when entering the field, for the following reason: the best F1 score reproduced by researchers all over the Internet is around 0.62, far below the 0.7 reported in the paper. You can discuss this phenomenon together with other researchers through these links: Zhihu (a Chinese forum); Issues of the official repo.
Ubuntu 18.04.5 LTS
Python 3.9.15
PyTorch 1.10.0 + CUDA 11.1
The detailed Python libraries can be found in requirements.txt.
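The dependencies can be installed in the usual way:

```
pip install -r requirements.txt
```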
Prepare your datasets:

All dataset images are managed with a txt file recording the path of each image, its ground truth, its edge mask (if it has one), and its image label. Each line of this file should contain 4 elements, separated by spaces:

image_path groundtruth_path edge_mask_path label

where None stands in for a missing ground truth or edge mask, and the label is 0 for authentic and 1 for tampered images. For example, each line in the txt file should look like this:

```
./Casiav2/authentic/Au_ani_00001.jpg None None 0
./Casiav2/tampered/Tp_D_CND_M_N_ani00018_sec00096_00138.tif ./Casiav2/mask/Tp_D_CND_M_N_ani00018_sec00096_00138_gt.png ./Casiav2/edge/Tp_D_CND_M_N_ani00018_sec00096_00138_gt.png 1
./Casiav2/tampered/Tp_D_CND_M_N_ani00018_sec00096_00138.tif ./Casiav2/mask/Tp_D_CND_M_N_ani00018_sec00096_00138_gt.png None 1
```

You should follow this format and generate your own "path file" as an xxxx.txt.
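For reference, here is a minimal sketch of how such a path file could be parsed; the handling of None entries and the parse_paths_file name are our own assumptions based on the examples above, not code from this repo:

```python
# Minimal sketch: parse the 4-element path file described above.
# Assumption: "None" marks a missing ground-truth or edge mask,
# as in the example lines; parse_paths_file is a hypothetical helper.
def parse_paths_file(paths_file):
    samples = []
    with open(paths_file) as f:
        for line in f:
            parts = line.split()
            if len(parts) != 4:
                continue  # skip empty or malformed lines
            image, mask, edge, label = parts
            samples.append({
                "image": image,
                "mask": None if mask == "None" else mask,
                "edge": None if edge == "None" else edge,
                "label": int(label),  # 0 = authentic, 1 = tampered
            })
    return samples
```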
Limits: At this time, the edge masks can only be generated during training and cannot be pre-generated, which is a bit slow: every epoch regenerates an edge mask for each image, even though it is always the same edge mask. A better choice would be to generate the edge masks from the ground-truth masks before training starts. A script for pre-generating the edge masks will be released later...
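Until that script is released, pre-generating the edge masks yourself could look like the sketch below. It assumes the edge mask is the morphological gradient of the binarized ground-truth mask (a dilation minus an erosion); the kernel size here is a guess and may not match what the training code produces:

```python
import cv2
import numpy as np

def pregenerate_edge_mask(mask_path, out_path, kernel_size=5):
    """Sketch: derive an edge mask from a ground-truth mask.

    Assumption: the edge is approximated by the morphological
    gradient (dilation minus erosion) of the binarized mask;
    kernel_size=5 is a guess, not the repo's actual setting.
    """
    mask = cv2.imread(mask_path, cv2.IMREAD_GRAYSCALE)
    mask = (mask > 127).astype(np.uint8) * 255  # binarize
    kernel = np.ones((kernel_size, kernel_size), np.uint8)
    edge = cv2.morphologyEx(mask, cv2.MORPH_GRADIENT, kernel)
    cv2.imwrite(out_path, edge)
```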
Then you can start to run this work; the main entry point is train_launch.py. Since torch.nn.parallel.DistributedDataParallel is used, you need to use the following command to start the training:
```
torchrun \
    --standalone \
    --nproc_per_node=<number of your GPUs> \
    train_launch.py \
    --paths_file <your own path txt file>
```
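For example, on a machine with 4 GPUs and a hypothetical path file named paths.txt:

```
torchrun --standalone --nproc_per_node=4 train_launch.py --paths_file paths.txt
```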
You can use TensorBoard to monitor the progress of the model during training. Logs should be under the ./save/ path.
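Assuming the default log location, TensorBoard can be launched against that directory:

```
tensorboard --logdir ./save/
```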
Still working on it...
We trained for 200 epochs with learning-rate decay, and finally selected the best-performing checkpoints on each dataset. Here are the results. Please note that these are only approximate results and we did not make any further adjustments, so they should be taken as a reference only.

| Dataset  | Checkpoint | pixel_f1 | acc  | sen  | spe  | imagelevel_f1 | img_auc | com_f1 |
| -------- | ---------- | -------- | ---- | ---- | ---- | ------------- | ------- | ------ |
| CASIAv1  | 11_end.pth | 0.43     | 0.69 | 0.75 | 0.63 | 0.68          | 0.78    | 0.53   |
| CASIAv1  | 40_end.pth | 0.40     | 0.74 | 0.55 | 0.95 | 0.70          | 0.80    | 0.51   |
| COVERAGE | 11_end.pth | 0.33     | 0.52 | 0.96 | 0.08 | 0.15          | 0.56    | 0.20   |
| COVERAGE | 21_end.pth | 0.13     | 0.56 | 0.59 | 0.53 | 0.56          | 0.59    | 0.21   |
| COVERAGE | 7_end.pth  | 0.22     | 0.54 | 0.81 | 0.28 | 0.42          | 0.55    | 0.29   |
| Columbia | 11_end.pth | 0.44     | 0.66 | 0.98 | 0.36 | 0.52          | 0.84    | 0.48   |
| Columbia | 35_end.pth | 0.20     | 0.81 | 0.86 | 0.77 | 0.81          | 0.88    | 0.32   |
| NIST16   | 3_end.pth  | 0.20     | 0.66 | 0.66 | 0.00 | 0.00          | 0.00    | 0.00   |
If you want to train this model with the CASIAv2 dataset, we provide a revised version of the CASIAv2 dataset, which corrects several mistakes in the original dataset provided by the authors. Details can be found in the link shown below:
[1] Chen, X., Dong, C., Ji, J., Cao, J., & Li, X. (2021). Image Manipulation Detection by Multi-View Multi-Scale Supervision. 2021 IEEE/CVF International Conference on Computer Vision (ICCV), 14165–14173. https://doi.org/10.1109/ICCV48922.2021.01392