SGTN: Privacy-Preserving Visual Content Tagging using Graph Transformer Networks
This repository implements SGTN for privacy-preserving visual content tagging using Graph Transformer Networks, as presented at ACM Multimedia 2020.
Requirements
Please install the following packages:
- numpy
- pytorch (1.*)
- torchnet
- torchvision
- tqdm
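A quick way to verify the environment is a Python import check (only PyTorch 1.x is required by this project; no other version pins are implied):

```python
# Sanity check: all required packages are importable.
import numpy
import torch
import torchnet
import torchvision
import tqdm

print(torch.__version__)  # a 1.x release is expected by this project
```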
Download best checkpoints
- SGTN on MS-COCO - checkpoint/coco/SGTN_N_86.6440.pth.tar (GDrive)
- SGTN on PP-MS-COCO - checkpoint/coco/SGTN_A_85.5768.pth.tar (GDrive)
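The downloaded checkpoints can be inspected with a short PyTorch snippet. This is only a minimal sketch; the assumed layout (a dict with a `state_dict` entry) may differ from how `sgtn.py` actually restores the weights:

```python
import torch

# Path of the MS-COCO checkpoint listed above.
ckpt_path = "checkpoint/coco/SGTN_N_86.6440.pth.tar"

# map_location="cpu" allows loading the file without a GPU.
checkpoint = torch.load(ckpt_path, map_location="cpu")

# Assumption: the weights are stored under "state_dict", as is common
# for .pth.tar checkpoints; otherwise fall back to the object itself.
state_dict = checkpoint.get("state_dict", checkpoint) if isinstance(checkpoint, dict) else checkpoint
print(len(state_dict), "entries, e.g.", list(state_dict)[:3])
```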
Performance
| Method | mAP | CP | CR | CF1 | OP | OR | OF1 |
| :------------------ | :------: | :------: | :------: | :------: | :------: | :------: | :------: |
| CNN-RNN | 61.2 | \- | \- | \- | \- | \- | \- |
| SRN | 77.1 | 81.6 | 65.4 | 71.2 | 82.7 | 69.9 | 75.8 |
| Baseline(ResNet101) | 77.3 | 80.2 | 66.7 | 72.8 | 83.9 | 70.8 | 76.8 |
| Multi-Evidence | – | 80.4 | 70.2 | 74.9 | 85.2 | 72.5 | 78.4 |
| ML-GCN | 82.4 | **84.4** | 71.4 | 77.4 | **85.8** | 74.5 | **79.8** |
| SGTN | **86.6** | 77.2 | **82.2** | **79.6** | 76.0 | **82.6** | 77.2 |
| ML-GCN (PP) | 80.3 | 84.6 | 68.1 | 75.5 | 85.2 | 72.4 | 78.3 |
| SGTN (PP) | **85.6** | **85.3** | **75.3** | **79.9** | **85.3** | **78.7** | **81.8** |
Performance comparison on COCO and PP-COCO. SGTN outperforms the baselines by large margins.
PP denotes the use of the anonymised (privacy-preserving) dataset.
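The per-class (CP, CR, CF1) and overall (OP, OR, OF1) columns follow the standard multi-label definitions. The sketch below shows one way to compute them from score and label matrices; thresholding scores at 0.5 is an assumption, not necessarily this repository's exact evaluation code:

```python
import numpy as np

def multilabel_metrics(scores, targets, threshold=0.5):
    """CP, CR, CF1, OP, OR, OF1 for (num_images, num_classes) score
    and 0/1 target matrices. The 0.5 threshold is an assumption."""
    preds = (scores >= threshold).astype(np.float64)
    targets = targets.astype(np.float64)

    tp = (preds * targets).sum(axis=0)      # true positives per class
    pred_pos = preds.sum(axis=0)            # predicted positives per class
    true_pos = targets.sum(axis=0)          # ground-truth positives per class

    eps = 1e-9
    cp = np.mean(tp / (pred_pos + eps))     # per-class precision
    cr = np.mean(tp / (true_pos + eps))     # per-class recall
    cf1 = 2 * cp * cr / (cp + cr + eps)

    op = tp.sum() / (pred_pos.sum() + eps)  # overall precision
    o_r = tp.sum() / (true_pos.sum() + eps) # overall recall
    of1 = 2 * op * o_r / (op + o_r + eps)
    return cp, cr, cf1, op, o_r, of1
```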
SGTN on COCO
python sgtn.py data/coco --image-size 448 --workers 8 --batch-size 32 --lr 0.03 --learning-rate-decay 0.1 --epoch_step 80 --embedding data/coco/coco_glove_word2vec.pkl --adj-dd-threshold 0.4 --device_ids 0
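The `--adj-dd-threshold 0.4` flag suggests that a data-driven label correlation matrix is binarised before being used as the graph adjacency, in the spirit of ML-GCN's correlation re-weighting. The sketch below only illustrates that kind of construction; it is an assumption about what the flag controls, not the repository's implementation:

```python
import numpy as np

def threshold_adjacency(cooccurrence, label_counts, tau=0.4):
    """Binarise label co-occurrence statistics into an adjacency matrix.
    P(label j | label i) is estimated from counts; entries below tau are
    zeroed. Hypothetical helper mirroring an ML-GCN-style construction."""
    prob = cooccurrence / np.maximum(label_counts[:, None], 1.0)
    adj = (prob >= tau).astype(np.float64)
    np.fill_diagonal(adj, 1.0)  # keep self-loops
    return adj

# Toy example with three labels.
cooc = np.array([[50, 30, 5], [30, 40, 2], [5, 2, 10]], dtype=np.float64)
counts = np.array([50.0, 40.0, 10.0])
print(threshold_adjacency(cooc, counts, tau=0.4))
```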
How to cite this work?
@inproceedings{Vu:ACMMM:2020,
author = {Vu, Xuan-Son and Le, Duc-Trong and Edlund, Christoffer and Jiang, Lili and Nguyen, Hoang D.},
title = {Privacy-Preserving Visual Content Tagging using Graph Transformer Networks},
booktitle = {ACM International Conference on Multimedia},
series = {ACM MM '20},
year = {2020},
publisher = {ACM},
address = {New York, NY, USA}
}
Reference
This project is based on the following implementations: