Hi, this is the code for our KDD 2022 paper: Bilateral Dependency Optimization: Defending Against Model-inversion Attacks.
Overview of MID framework vs. bilateral dependency optimization (BiDO) framework. BiDO forces DNNs to learn robust latent representations by minimizing $d(X,Z_j)$ to limit redundant information propagated from the inputs to the latent representations while maximizing $d(Z_j,Y)$ to keep the latent representations informative enough about the label.
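One way to write the objective described above, using notation that is assumed here rather than defined in this README: $X$ is the input, $Z_j$ is the $j$-th of $M$ latent representations, $Y$ is the label, $\mathcal{L}_{\mathrm{CE}}$ is the usual classification loss, and $(\lambda_x, \lambda_y)$ are the two balancing hyper-parameters (presumably the pairs quoted in the commands of this README, in that order).

$$\mathcal{L}_{\mathrm{BiDO}} = \mathcal{L}_{\mathrm{CE}}\big(f(X), Y\big) + \lambda_x \sum_{j=1}^{M} d(X, Z_j) - \lambda_y \sum_{j=1}^{M} d(Z_j, Y)$$

Here $d$ is a dependency measure (COCO or HSIC in this repo). The first sum penalizes dependence between inputs and latents; the second rewards dependence between latents and labels.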
This code has been tested on Ubuntu 16.04/18.04 with Python 3.7, PyTorch 1.7, and CUDA 10.2/11.0.
Download relevant datasets: CelebA, MNIST.
python prepare_data.py
The directory of datasets is organized as follows:
./attack_dataset
├── MNIST
│   ├── *.txt
│   └── Img
│       └── *.png
└── CelebA
    ├── *.txt
    └── Img
        └── *.png
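Before training, it can help to confirm that the layout above is in place. The following is a minimal sketch, not part of this repo: `check_layout` is a hypothetical helper, and it only checks that each dataset folder contains at least one `*.txt` file and at least one `*.png` under `Img`.

```python
from pathlib import Path


def check_layout(root, datasets=("MNIST", "CelebA")):
    """Return a list of problems found; an empty list means the layout looks right."""
    problems = []
    root = Path(root)
    for name in datasets:
        base = root / name
        # Each dataset folder is expected to hold *.txt files at its top level.
        if not any(base.glob("*.txt")):
            problems.append(f"{base}: no *.txt files found")
        # Images are expected as *.png files inside the Img subfolder.
        if not any((base / "Img").glob("*.png")):
            problems.append(f"{base / 'Img'}: no *.png files found")
    return problems
```

Calling `check_layout("./attack_dataset")` after `prepare_data.py` has run should return an empty list if both datasets were prepared.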
You can also skip to the next section for defending against MI attacks with well-trained defense models.
# dataset: celeba, mnist, cifar
# measure: COCO, HSIC
# balancing hyper-parameters: tune them in train_HSIC.py
python train_HSIC.py --measure=HSIC --dataset=celeba
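To make the `--measure=HSIC` option more concrete, here is a pure-Python sketch of the standard biased empirical HSIC estimator with Gaussian kernels, plus the way the two dependency terms could be combined with a classification loss. It is an illustration under stated assumptions, not the code in `train_HSIC.py`: the function names, the fixed kernel width `sigma`, and the use of plain lists instead of PyTorch tensors are all choices made here.

```python
import math


def gaussian_gram(rows, sigma=1.0):
    """Gram matrix K[i][j] = exp(-||r_i - r_j||^2 / (2 * sigma^2))."""
    n = len(rows)
    return [[math.exp(-sum((a - b) ** 2 for a, b in zip(rows[i], rows[j]))
                      / (2 * sigma ** 2))
             for j in range(n)] for i in range(n)]


def center(K):
    """Double-center a Gram matrix, i.e. compute H K H with H = I - 11^T / n."""
    n = len(K)
    row_mean = [sum(r) / n for r in K]
    col_mean = [sum(K[i][j] for i in range(n)) / n for j in range(n)]
    total = sum(row_mean) / n
    return [[K[i][j] - row_mean[i] - col_mean[j] + total for j in range(n)]
            for i in range(n)]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    Kc = center(gaussian_gram(xs, sigma))
    Lc = center(gaussian_gram(ys, sigma))
    # tr(Kc Lc) equals tr(K H L H) because H is idempotent.
    return sum(Kc[i][j] * Lc[j][i] for i in range(n) for j in range(n)) / (n - 1) ** 2


def bido_loss(ce, x, latents, y, lambda_x, lambda_y):
    """Classification loss plus input-latent penalty minus latent-label reward."""
    dep_in = sum(hsic(x, z) for z in latents)
    dep_out = sum(hsic(z, y) for z in latents)
    return ce + lambda_x * dep_in - lambda_y * dep_out
```

Each argument is a list of feature vectors, one per sample in the batch (labels would be one-hot vectors). In practice the same computation would be done on tensors so that gradients flow back to the network.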
For KED-MI, if you trained a defense model yourself, you have to train an attack model (generative model) specific to this defense model.
# put your hyper-parameters in k+1_gan_HSIC.py first
python k+1_gan_HSIC.py --dataset=celeba --defense=HSIC
Here, we provide only the weight files of the well-trained defense models that achieve the best trade-off between model robustness and utility; these are the models highlighted in the experimental results.
GMI
Weight files (defense model / eval model / GAN):
BiDO/target_model/
BiDO/target_model/celeba/HSIC/
GMI/eval_model/
GMI/result/models_celeba_gan/
Launch attack
# balancing hyper-parameters: (0.05, 0.5)
python attack.py --dataset=celeba --defense=HSIC
Calculate FID
# sample real images from training set
cd attack_res/celeba/pytorch-fid && python private_domain.py
# calculate FID between fake and real images
python fid_score.py ../celeba/trainset/ ../celeba/HSIC/all/
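For reference, assuming `fid_score.py` follows the standard pytorch-fid definition, the score compares Gaussian fits to Inception features of the two image folders. With $(\mu_r, \Sigma_r)$ the feature mean and covariance of the real (private) images and $(\mu_g, \Sigma_g)$ those of the reconstructed images:

$$\mathrm{FID} = \lVert \mu_r - \mu_g \rVert_2^2 + \operatorname{Tr}\!\left(\Sigma_r + \Sigma_g - 2\,(\Sigma_r \Sigma_g)^{1/2}\right)$$

A lower value means the reconstructed images are statistically closer to the private training images, so a defense that works should push this number up.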
KED-MI
BiDO/target_model/mnist/COCO/
DMI/eval_model/
DMI/improvedGAN/celeba/HSIC/
DMI/improvedGAN/mnist/COCO/
# balancing hyper-parameters: (0.05, 0.5)
python recovery.py --dataset=celeba --defense=HSIC
# balancing hyper-parameters: (1, 50)
python recovery.py --dataset=mnist --defense=COCO
# celeba
cd attack_res/celeba/pytorch-fid && python private_domain.py
python fid_score.py ../celeba/trainset/ ../celeba/HSIC/all/ --dataset=celeba
# mnist
cd attack_res/mnist/pytorch-fid && python private_domain.py
python fid_score.py ../mnist/trainset/ ../mnist/COCO/all/ --dataset=mnist
VMI
To run this code, you need ~38 GB of memory for data loading. Attacking 20 identities takes ~20 hours on a TITAN V GPU (12 GB).
# create a link to CelebA
cd VMI/data && ln -s ../../attack_data/CelebA/Img img_align_celeba
python celeba.py
VMI/clf_results/celeba/hsic_0.1&2/
VMI/3rd_party/InsightFace_Pytorch/work_space/save/
Place the evaluation classifier in VMI/pretrained/eval_clf/celeba/
VMI/pretrained/stylegan/neurips2021-celeba-stylegan/
# balancing hyper-parameters: (0.1, 2)
cd VMI
# x.sh (1st) path/to/attack_results (2nd) config_file (3rd) batch_size
./run_scripts/neurips2021-celeba-stylegan-flow.sh 'hsic_0.1&2' 'hsic_0.1&2.yml' 32
python classify_mnist.py --epochs=100 --dataset=celeba --output_dir='./clf_results/celeba/hsic_0.1&2' --model=ResNetClsH --measure=hsic --a1=0.1 --a2=2
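The directory name `hsic_0.1&2` contains `&`, which the shell treats as a control operator, so any path that includes it has to be quoted on the command line. The sketch below shows one way to build the training command from Python with safe quoting. `defense_tag` and `train_command` are hypothetical helpers written for this illustration; only the flags shown in the command above are used.

```python
import shlex


def defense_tag(measure, a1, a2):
    """Build a tag such as 'hsic_0.1&2' from the measure and the two weights."""
    return f"{measure}_{a1:g}&{a2:g}"


def train_command(measure, a1, a2, epochs=100):
    """Return a shell-safe command string for training a VMI defense classifier."""
    tag = defense_tag(measure, a1, a2)
    args = [
        "python", "classify_mnist.py",
        f"--epochs={epochs}",
        "--dataset=celeba",
        f"--output_dir=./clf_results/celeba/{tag}",
        "--model=ResNetClsH",
        f"--measure={measure}",
        f"--a1={a1:g}",
        f"--a2={a2:g}",
    ]
    # shlex.quote wraps any argument containing shell-special characters
    # (here the '&') in single quotes and leaves the others untouched.
    return " ".join(shlex.quote(a) for a in args)


print(train_command("hsic", 0.1, 2))
```

Passing the argument list directly to `subprocess.run` (without `shell=True`) avoids the quoting issue altogether, since no shell parses the string.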
If you find this code helpful in your research, please consider citing:
@inproceedings{peng2022BiDO,
title={Bilateral Dependency Optimization: Defending Against Model-inversion Attacks},
author={Peng, Xiong and Liu, Feng and Zhang, Jingfeng and Lan, Long and Ye, Junjie and Liu, Tongliang and Han, Bo},
booktitle={KDD},
year={2022}
}
Some of our implementations rely on other repos. We want to thank the authors (MID, GMI, KED-MI, VMI) for making their code publicly available.