Comparing Different Membership Inference Attacks with a Comprehensive Benchmark
Membership inference attacks pose a significant threat to user privacy in machine learning systems. While numerous attack mechanisms have been proposed in the literature, the lack of standardized evaluation parameters and metrics has led to inconsistent and even conflicting comparison results. To address this issue and facilitate a systematic analysis of these disparate findings, we introduce MIBench, a comprehensive benchmark for membership inference attacks. MIBench includes a suite of carefully designed evaluation scenarios and evaluation metrics to provide a consistent framework for assessing the efficacy of various membership inference techniques. The evaluation scenarios are crafted to encompass four critical factors: intra-dataset distance distribution, inter-sample distance within the target dataset, differential distance analysis, and inference withholding ratio. In total, MIBench includes ten typical evaluation metrics and incorporates 84 distinct evaluation scenarios for each dataset. Using this framework, we conducted a comparative analysis of 15 state-of-the-art membership inference attack algorithms across 588 evaluation scenarios, 7 widely adopted datasets, and 7 representative model architectures. Our analysis revealed 83 instances of Conflicting Comparison Results (CCR), providing substantial evidence for the CCR Phenomenon. We identified two CCR types: Type 1 (single-factor) and Type 2 (dual-factor). The distribution of CCR instances across the four critical factors was: inter-sample distance (40.96%), differential distance (37.35%), inference withholding ratio (19.28%), and intra-dataset distance (2.41%). All code and evaluation results of MIBench are publicly available.
MI attacks:
Datasets: CIFAR100, CIFAR10, CH_MNIST, ImageNet, Location30, Purchase100, Texas100
Models: MLP, StandDNN, VGG16, VGG19, ResNet50, ResNet101, DenseNet121
Requirements: Run the following script to configure the required environment: sh ./sh/install.sh
Usage: First create a folder named record; by default, all experiment results are saved to the record folder. Then create a folder named data to hold the supported datasets. XXX XXX
Attack: The following demo command runs NN_attack on CIFAR100: python ./attack/NN_attack.py --yaml_path ../config/attack/NN/CIFAR100.yaml --dataset CIFAR100 --dataset_path ../data --save_folder_name CIFAR100_0_1
Selected attacks:
Evaluation Framework:
MIBench is a comprehensive benchmark for comparing different MI attacks. It consists of both an evaluation metric module and an evaluation scenario module.
In this work, we have designed and implemented the MIBench benchmark with 84 evaluation scenarios for each dataset. In total, we have used our benchmark to fairly and systematically compare 15 state-of-the-art MI attack algorithms across 588 evaluation scenarios, and these evaluation scenarios cover 7 widely used datasets and 7 representative types of models.
(a) Evaluation Scenarios of CIFAR100.
(b) Evaluation Scenarios of CIFAR10.
(c) Evaluation Scenarios of CH_MNIST.
(d) Evaluation Scenarios of ImageNet.
(e) Evaluation Scenarios of Location30.
(f) Evaluation Scenarios of Purchase100.
(g) Evaluation Scenarios of Texas100.
Part II: Evaluation Metrics
We mainly use the following evaluation metrics: attacker-side accuracy, precision, recall, f1-score, false positive rate (FPR), false negative rate (FNR), membership advantage (MA), the Area Under the Curve (AUC) of the attack Receiver Operating Characteristic (ROC) curve, TPR @ fixed (low) FPR, and threshold at maximum MA. Each metric is defined below.
(a) accuracy: the percentage of data samples whose membership is correctly predicted by an MI attack;
(b) precision: the fraction of true members among all samples that the adversary predicts as members;
(c) recall: the fraction of true members that the adversary correctly predicts as members;
(d) f1-score: the harmonic mean of precision and recall;
(e) false positive rate (FPR): the fraction of nonmember samples that are erroneously predicted as members;
(f) false negative rate (FNR): the difference between 1 and recall (i.e., FNR = 1 - recall);
(g) membership advantage (MA): the difference between the true positive rate and the false positive rate (i.e., MA = TPR - FPR);
(h) Area Under the Curve (AUC): the area under the attack's Receiver Operating Characteristic (ROC) curve;
(i) TPR @ fixed (low) FPR: an attack's true-positive rate at a fixed, low false-positive rate;
(j) threshold at maximum MA: the decision threshold at which MA is maximized.
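As an illustration of metrics (a) through (g), the following is a minimal sketch that derives them from binary membership labels and binary attack predictions. It is not taken from the MIBench code base; the function name `confusion_metrics` and the toy data are assumptions made for this example only.

```python
import numpy as np

def confusion_metrics(is_member, predicted_member):
    """Compute metrics (a)-(g) from true membership and binary attack predictions.

    Hypothetical helper for illustration; not part of MIBench.
    """
    y = np.asarray(is_member, dtype=bool)
    p = np.asarray(predicted_member, dtype=bool)
    tp = int(np.sum(y & p))      # members predicted as members
    fp = int(np.sum(~y & p))     # nonmembers predicted as members
    tn = int(np.sum(~y & ~p))    # nonmembers predicted as nonmembers
    fn = int(np.sum(y & ~p))     # members predicted as nonmembers

    def ratio(num, den):
        # Return 0.0 instead of dividing by zero on degenerate inputs.
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)              # recall is the true positive rate
    fpr = ratio(fp, fp + tn)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "f1": ratio(2 * precision * recall, precision + recall),
        "fpr": fpr,
        "fnr": 1.0 - recall,                 # FNR = 1 - recall
        "ma": recall - fpr,                  # MA = TPR - FPR
    }

# Toy data: 4 members and 4 nonmembers, with one error on each side.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
preds = [1, 1, 1, 0, 1, 0, 0, 0]
metrics = confusion_metrics(labels, preds)
print(metrics)
```

On this toy input the attack finds 3 of 4 members and mislabels 1 of 4 nonmembers, so recall is 0.75, FPR is 0.25, and MA is 0.5.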
Results:
The results section consists of three parts: the results of the 84 evaluation scenarios (ES), the thresholds at maximum MA for the Risk score and Shapley values attacks, and the results of the 4 research questions (RQ). In Part I and Part III, we report the evaluation results of 15 state-of-the-art MI attacks under ten evaluation metrics: attacker-side accuracy, precision, recall, f1-score, FPR, FNR, MA, AUC, TPR @ fixed (low) FPR (T@0.01%F and T@0.1%F), and threshold at maximum MA.
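The score-based metrics mentioned here (AUC, TPR at a fixed low FPR such as T@0.1%F, and the threshold at maximum MA) all come from sweeping a decision threshold over the attack's membership scores. The sketch below shows one way to compute them with NumPy. It assumes that a higher score means "more likely a member"; the function names and toy scores are hypothetical and are not MIBench's implementation.

```python
import numpy as np

def roc_points(is_member, scores):
    """Sweep the rule 'predict member if score >= threshold' over all distinct scores."""
    y = np.asarray(is_member, dtype=bool)
    s = np.asarray(scores, dtype=float)
    thresholds = np.unique(s)[::-1]               # distinct scores, descending
    pred = s[None, :] >= thresholds[:, None]      # shape: (n_thresholds, n_samples)
    tpr = (pred & y).sum(axis=1) / max(int(y.sum()), 1)
    fpr = (pred & ~y).sum(axis=1) / max(int((~y).sum()), 1)
    return thresholds, fpr, tpr

def auc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule, starting from (0, 0)."""
    x = np.concatenate(([0.0], fpr))
    y = np.concatenate(([0.0], tpr))
    return float(np.sum((x[1:] - x[:-1]) * (y[1:] + y[:-1]) / 2.0))

def tpr_at_fpr(fpr, tpr, max_fpr):
    """Best TPR among thresholds whose FPR does not exceed max_fpr."""
    ok = fpr <= max_fpr
    return float(tpr[ok].max()) if ok.any() else 0.0

def threshold_at_max_ma(thresholds, fpr, tpr):
    """Threshold that maximizes MA = TPR - FPR, together with that MA."""
    ma = tpr - fpr
    i = int(np.argmax(ma))
    return float(thresholds[i]), float(ma[i])

# Toy data: 4 members followed by 4 nonmembers.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
thresholds, fpr, tpr = roc_points(labels, scores)
attack_auc = auc(fpr, tpr)
t_at_low_fpr = tpr_at_fpr(fpr, tpr, max_fpr=0.001)   # T@0.1%F
best_threshold, best_ma = threshold_at_max_ma(thresholds, fpr, tpr)
print(attack_auc, t_at_low_fpr, best_threshold, best_ma)
```

With only four nonmembers, the smallest nonzero FPR is 0.25, so T@0.1%F here is simply the TPR reached before the first false positive. In practice a fixed FPR as low as 0.01% is only meaningful with a large number of nonmember samples.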
1. Distillation-based:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
2. Calibrated Score:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Purchase100:
(6) Texas100:
3. Label-only:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
4. NN_attack:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
5. PPV:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
6. Risk score:
(1) CIFAR100:
(2) CH_MNIST:
(3) ImageNet:
(4) Location30:
(5) Purchase100:
(6) Texas100:
7. Shapley values:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
8. Top1_Threshold:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
9. BlindMI-1CLASS:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
10. Top3_NN:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
11. LiRA:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
12. Top2+True:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
13. BlindMI-w:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
14. BlindMI-without:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) Location30:
(5) Purchase100:
(6) Texas100:
15. Loss-Threshold:
(1) CIFAR100:
(2) CIFAR10:
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
1. Risk score attacks:
(1) CIFAR100:
(2) CH_MNIST:
(3) ImageNet:
(4) Location30:
(5) Purchase100:
(6) Texas100:
2. Shapley values attacks:
(1) CIFAR100:
(2) CIFAR10:
<img width="2179" alt="CIFAR10_Shapley values_different classes_threshold_03" src="https://github.com/MIBench/MIBench.github.io/assets/124696836/65f3bd5a-6797-406c-ac17-c46eb044f64f">
(3) CH_MNIST:
(4) ImageNet:
(5) Location30:
(6) Purchase100:
(7) Texas100:
(1) CIFAR100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: CIFAR100_Normal + 2.893 + 0.085 + 20%
ES29: CIFAR100_Uniform + 2.893 + 0.085 + 20%
ES57: CIFAR100_Bernoulli + 2.893 + 0.085 + 20%
RQ2: Effect of Distance between data samples of the Target Dataset
ES02: CIFAR100_Normal + 2.893 + 0.085 + 40%
ES10: CIFAR100_Normal + 3.813 + 0.085 + 40%
ES22: CIFAR100_Normal + 4.325 + 0.085 + 40%
RQ3: Effect of Differential Distances between two datasets
ES03: CIFAR100_Normal + 2.893 + 0.085 + 45%
ES05: CIFAR100_Normal + 2.893 + 0.119 + 45%
ES07: CIFAR100_Normal + 2.893 + 0.157 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES37: CIFAR100_Uniform + 3.813 + 0.085 + 20%
ES38: CIFAR100_Uniform + 3.813 + 0.085 + 40%
ES39: CIFAR100_Uniform + 3.813 + 0.085 + 45%
ES40: CIFAR100_Uniform + 3.813 + 0.085 + 49%
(2) CIFAR10:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES13: CIFAR10_Normal + 2.501 + 0.213 + 20%
ES41: CIFAR10_Uniform + 2.501 + 0.213 + 20%
ES69: CIFAR10_Bernoulli + 2.501 + 0.213 + 20%
RQ2: Effect of Distance between data samples of the Target Dataset
ES02: CIFAR10_Normal + 1.908 + 0.155 + 40%
ES10: CIFAR10_Normal + 2.501 + 0.155 + 40%
ES22: CIFAR10_Normal + 3.472 + 0.155 + 40%
RQ3: Effect of Differential Distances between two datasets
ES51: CIFAR10_Uniform + 3.472 + 0.155 + 45%
ES53: CIFAR10_Uniform + 3.472 + 0.213 + 45%
ES55: CIFAR10_Uniform + 3.472 + 0.291 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES57: CIFAR10_Bernoulli + 1.908 + 0.155 + 20%
ES58: CIFAR10_Bernoulli + 1.908 + 0.155 + 40%
ES59: CIFAR10_Bernoulli + 1.908 + 0.155 + 45%
ES60: CIFAR10_Bernoulli + 1.908 + 0.155 + 49%
(3) CH_MNIST:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES21: CH_MNIST_Normal + 1.720 + 0.083 + 20%
ES49: CH_MNIST_Uniform + 1.720 + 0.083 + 20%
ES77: CH_MNIST_Bernoulli + 1.720 + 0.083 + 20%
RQ2: Effect of Distance between data samples of the Target Dataset
ES04: CH_MNIST_Uniform + 0.954 + 0.108 + 40%
ES14: CH_MNIST_Uniform + 1.355 + 0.108 + 40%
ES24: CH_MNIST_Uniform + 1.720 + 0.108 + 40%
RQ3: Effect of Differential Distances between two datasets
ES03: CH_MNIST_Normal + 0.954 + 0.083 + 45%
ES05: CH_MNIST_Normal + 0.954 + 0.108 + 45%
ES07: CH_MNIST_Normal + 0.954 + 0.133 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES73: CH_MNIST_Bernoulli + 1.355 + 0.133 + 20%
ES74: CH_MNIST_Bernoulli + 1.355 + 0.133 + 40%
ES75: CH_MNIST_Bernoulli + 1.355 + 0.133 + 45%
ES76: CH_MNIST_Bernoulli + 1.355 + 0.133 + 49%
(4) ImageNet:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES02: ImageNet_Normal + 0.934 + 0.046 + 40%
ES30: ImageNet_Uniform + 0.934 + 0.046 + 40%
ES58: ImageNet_Bernoulli + 0.934 + 0.046 + 40%
RQ2: Effect of Distance between data samples of the Target Dataset
ES34: ImageNet_Uniform + 0.934 + 0.080 + 49%
ES44: ImageNet_Uniform + 1.130 + 0.080 + 49%
ES54: ImageNet_Uniform + 1.388 + 0.080 + 49%
RQ3: Effect of Differential Distances between two datasets
ES79: ImageNet_Bernoulli + 1.388 + 0.046 + 45%
ES81: ImageNet_Bernoulli + 1.388 + 0.080 + 45%
ES83: ImageNet_Bernoulli + 1.388 + 0.145 + 45%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES13: ImageNet_Normal + 1.130 + 0.080 + 20%
ES14: ImageNet_Normal + 1.130 + 0.080 + 40%
ES15: ImageNet_Normal + 1.130 + 0.080 + 45%
ES16: ImageNet_Normal + 1.130 + 0.080 + 49%
(5) Location30:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Location30_Normal + 0.570 + 0.041 + 4%
ES29: Location30_Uniform + 0.570 + 0.041 + 4%
ES57: Location30_Bernoulli + 0.570 + 0.041 + 4%
RQ2: Effect of Distance between data samples of the Target Dataset
ES32: Location30_Uniform + 0.570 + 0.076 + 8%
ES42: Location30_Uniform + 0.724 + 0.076 + 8%
ES52: Location30_Uniform + 0.801 + 0.076 + 8%
RQ3: Effect of Differential Distances between two datasets
ES23: Location30_Normal + 0.801 + 0.041 + 12%
ES25: Location30_Normal + 0.801 + 0.076 + 12%
ES27: Location30_Normal + 0.801 + 0.094 + 12%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES73: Location30_Bernoulli + 0.724 + 0.094 + 4%
ES74: Location30_Bernoulli + 0.724 + 0.094 + 8%
ES75: Location30_Bernoulli + 0.724 + 0.094 + 12%
ES76: Location30_Bernoulli + 0.724 + 0.094 + 16%
(6) Purchase100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Purchase100_Normal + 0.550 + 0.087 + 2%
ES29: Purchase100_Uniform + 0.550 + 0.087 + 2%
ES57: Purchase100_Bernoulli + 0.550 + 0.087 + 2%
RQ2: Effect of Distance between data samples of the Target Dataset
ES04: Purchase100_Normal + 0.550 + 0.110 + 4%
ES14: Purchase100_Normal + 0.625 + 0.110 + 4%
ES24: Purchase100_Normal + 0.729 + 0.110 + 4%
RQ3: Effect of Differential Distances between two datasets
ES51: Purchase100_Uniform + 0.729 + 0.087 + 10%
ES53: Purchase100_Uniform + 0.729 + 0.110 + 10%
ES55: Purchase100_Uniform + 0.729 + 0.156 + 10%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES65: Purchase100_Bernoulli + 0.625 + 0.087 + 2%
ES66: Purchase100_Bernoulli + 0.625 + 0.087 + 4%
ES67: Purchase100_Bernoulli + 0.625 + 0.087 + 10%
ES68: Purchase100_Bernoulli + 0.625 + 0.087 + 12%
(7) Texas100:
RQ1: Effect of Distance Distribution of Data Samples in the Target Dataset
ES01: Texas100_Normal + 0.530 + 0.038 + 2%
ES29: Texas100_Uniform + 0.530 + 0.038 + 2%
ES57: Texas100_Bernoulli + 0.530 + 0.038 + 2%
RQ2: Effect of Distance between data samples of the Target Dataset
ES02: Texas100_Normal + 0.530 + 0.038 + 4%
ES10: Texas100_Normal + 0.641 + 0.038 + 4%
ES22: Texas100_Normal + 0.734 + 0.038 + 4%
RQ3: Effect of Differential Distances between two datasets
ES51: Texas100_Uniform + 0.734 + 0.038 + 10%
ES53: Texas100_Uniform + 0.734 + 0.073 + 10%
ES55: Texas100_Uniform + 0.734 + 0.107 + 10%
RQ4: Effect of the Ratio of Samples for Which an MI Attack Makes No Inference
ES65: Texas100_Bernoulli + 0.641 + 0.038 + 2%
ES66: Texas100_Bernoulli + 0.641 + 0.038 + 4%
ES67: Texas100_Bernoulli + 0.641 + 0.038 + 10%
ES68: Texas100_Bernoulli + 0.641 + 0.038 + 12%
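Each scenario label above follows the pattern `ESnn: <dataset>_<distribution> + <value> + <value> + <ratio>%`. Judging from which field changes under each research question, the second field appears to be the distance between data samples of the target dataset, the third the differential distance between two datasets, and the fourth the ratio of samples for which no inference is made. The following sketch parses such labels into structured records under that reading. The class, field, and function names are assumptions for illustration and do not come from the MIBench code.

```python
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Scenario:
    # Field names are hypothetical; their meanings are inferred from the RQ headings.
    scenario_id: int
    dataset: str
    distribution: str
    inter_sample_distance: float
    differential_distance: float
    withheld_ratio: float          # stored as a fraction, e.g. 0.2 for 20%

_LABEL = re.compile(
    r"^ES\s*(\d+)\s*:\s*(\w+)_(Normal|Uniform|Bernoulli)"
    r"\s*\+\s*([\d.]+)\s*\+\s*([\d.]+)\s*\+\s*([\d.]+)%$"
)

def parse_scenario(label):
    """Parse one 'ESnn: Dataset_Distribution + d1 + d2 + r%' label."""
    m = _LABEL.match(label.strip())
    if m is None:
        raise ValueError(f"unrecognized scenario label: {label!r}")
    es, dataset, dist, d1, d2, ratio = m.groups()
    return Scenario(int(es), dataset, dist, float(d1), float(d2), float(ratio) / 100.0)

def varying_fields(scenarios):
    """List the scenario factors that differ within a group of scenarios."""
    names = ("distribution", "inter_sample_distance",
             "differential_distance", "withheld_ratio")
    return [n for n in names if len({getattr(s, n) for s in scenarios}) > 1]

# The CIFAR100 RQ4 group from the list above.
rq4 = [parse_scenario(text) for text in (
    "ES37: CIFAR100_Uniform + 3.813 + 0.085 + 20%",
    "ES38: CIFAR100_Uniform + 3.813 + 0.085 + 40%",
    "ES39: CIFAR100_Uniform + 3.813 + 0.085 + 45%",
    "ES40: CIFAR100_Uniform + 3.813 + 0.085 + 49%",
)]
first = parse_scenario("ES49 : CH_MNIST_Uniform + 1.720 +0.083 + 20%")
print(varying_fields(rq4))
print(first)
```

A check like `varying_fields` is a quick way to confirm that each research question changes exactly one factor while holding the other three fixed.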
Additional Evaluation Results