FaceCure

Oblivious and adaptive "defenses" against poisoning attacks on facial recognition systems.

This repository contains code to evaluate two poisoning attacks against large-scale facial recognition systems, Fawkes and LowKey.

We evaluate two types of defense strategies against the Fawkes and LowKey attacks: oblivious defenses, in which the model trainer simply uses a different or newer feature extractor (e.g., the Fawkes v1.0 extractor, MagFace, or CLIP), and adaptive defenses, in which the trainer makes use of publicly available perturbed pictures (a robustified feature extractor, or robust end-to-end training).

We perform all of our experiments with the FaceScrub dataset, which contains over 50,000 images of 530 celebrities. We use the official aligned faces from this dataset and thus disable the automatic face-detection routines in Fawkes and LowKey.

Attack setup

Perturb images with Fawkes v1.0

Download Fawkes v1.0 here: https://github.com/Shawn-Shan/fawkes

WARNING: For the --no-align option to work properly, we need to patch fawkes/protection.py so that the option eval_local=True is passed to the Faces class.
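
Concretely, the patch amounts to adding eval_local=True to the call that constructs the Faces object in fawkes/protection.py. The exact argument list may differ across Fawkes versions, so treat the following as a sketch rather than an exact diff:

# In fawkes/protection.py (Fawkes v1.0), find the line that builds the Faces object
# and pass eval_local=True; the surrounding arguments may look different in your copy.
faces = Faces(image_paths, loaded_images, self.aligner,
              verbose=1, no_align=no_align, eval_local=True)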

Then, we generate perturbed pictures for one FaceScrub user as follows:

python3 fawkes/protection.py --gpu 0 -d facescrub/download/Adam_Sandler/face --batch-size 8 -m high --no-align

For each picture filename.png, this will create a perturbed picture filename_high_cloaked.png in the same directory.
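
If you want to perturb every FaceScrub user rather than a single one, you can drive the same command in a loop; here is a small sketch that assumes the facescrub/download/<name>/face directory layout used throughout this README:

import glob
import subprocess

# Run Fawkes on each user's aligned-face directory, one user at a time.
for user_dir in sorted(glob.glob("facescrub/download/*/face")):
    subprocess.run(
        ["python3", "fawkes/protection.py", "--gpu", "0", "-d", user_dir,
         "--batch-size", "8", "-m", "high", "--no-align"],
        check=True,
    )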

Example: original picture (left) and the same picture perturbed with Fawkes (right).

Perturb images with LowKey

Download LowKey here: https://openreview.net/forum?id=hJmtwocEqzc

We modified the attack script slightly so that the attack does not try to align faces. As a result, the attack can also be batched. The attack automatically resizes pictures to 112x112 pixels as LowKey does not seem to work well with larger pictures.

python3 lowkey_attack.py facescrub/download/Adam_Sandler/face

For each picture filename.png, this will create a resized picture filename_small.png and a perturbed picture filename_attacked.png in the same directory.
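
For reference, the resizing step that produces the filename_small.png files is nothing more than a 112x112 downscale; a minimal sketch (the modified lowkey_attack.py does this internally, and the file names here are illustrative):

from PIL import Image

# Downscale one face picture to the 112x112 resolution used by LowKey.
img = Image.open("facescrub/download/Adam_Sandler/face/example.png").convert("RGB")
img.resize((112, 112)).save("facescrub/download/Adam_Sandler/face/example_small.png")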

Example: original picture (left) and the same picture perturbed with LowKey (right).

Defense setup

We consider three common facial recognition approaches: a nearest-neighbor classifier on top of a pretrained feature extractor, a linear classifier on top of a pretrained feature extractor, and a model trained end-to-end on the face recognition task.

The evaluation code assumes that Fawkes v0.3 is on your PYTHONPATH. Download Fawkes v0.3 here: https://github.com/Shawn-Shan/fawkes/releases/tag/v0.3

We assume you have the original (aligned) FaceScrub pictures under facescrub/download/, as well as copies of the dataset with perturbed pictures under facescrub_fawkes_attack/download/ and facescrub_lowkey_attack/download/, as produced by the attack steps above.

In each of the experiments below, one FaceScrub user is chosen as the attacker. All of the training images of that user are replaced by perturbed images.

A facial recognition model is then trained on the entire training set. We report the attack's protection rate (a.k.a. the trained model's test error when evaluated on unperturbed images of the attacking user).
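
In code, the protection rate is simply one minus the model's accuracy on the attacking user's clean test images; a minimal sketch (variable names are illustrative):

import numpy as np

def protection_rate(predictions, true_label):
    # predictions: labels the trained model assigns to the attacking user's
    # unperturbed test images; true_label: that user's ground-truth identity.
    return float(np.mean(np.asarray(predictions) != true_label))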

Baseline evaluation with NN and linear classifiers

NN classifier

Train a nearest neighbor classifier on top of the Fawkes v0.3 feature extractor, with one attacking user.
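
The eval.py script handles this end to end; conceptually, the classifier is a 1-nearest-neighbor rule on face features, roughly as in the following scikit-learn sketch (not the repository's actual implementation):

from sklearn.neighbors import KNeighborsClassifier

def fit_and_predict_nn(train_features, train_labels, test_features):
    # train_features/train_labels: Fawkes v0.3 features and identities for the
    # full training set, where the attacking user's rows come from perturbed images.
    # test_features: features of that user's clean test images.
    clf = KNeighborsClassifier(n_neighbors=1)
    clf.fit(train_features, train_labels)
    return clf.predict(test_features)  # feed into protection_rate() above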

To evaluate Fawkes' attack:

python3 eval.py --facescrub-dir facescrub/download/ --attack-dir facescrub_fawkes_attack/download/ --unprotected-file-match .jpg --protected-file-match high_cloaked.png --classifier NN --names-list Adam_Sandler 

To evaluate LowKey's attack:

python3 eval.py --facescrub-dir facescrub/download/ --attack-dir facescrub_lowkey_attack/download/ --unprotected-file-match small.png --protected-file-match attacked.png --classifier  NN --names-list Adam_Sandler 

Results:

Fawkes (baseline NN) | LowKey (baseline NN)
Protection rate: 0.97 | Protection rate: 0.94

Thus, both attacks are very effective in this setting: the model correctly classifies only 3% (Fawkes) and 6% (LowKey) of the user's unperturbed images.

Linear classifier

You can set --classifier linear to train a linear classifier instead of a nearest-neighbor one.
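
A linear classifier on the same frozen features can be sketched with logistic regression (again, illustrative rather than the repository's exact implementation):

from sklearn.linear_model import LogisticRegression

def fit_and_predict_linear(train_features, train_labels, test_features):
    # Same inputs as the nearest-neighbor sketch above, but with a linear model
    # trained on the frozen face features.
    clf = LogisticRegression(max_iter=1000)
    clf.fit(train_features, train_labels)
    return clf.predict(test_features)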

Defense evaluation with NN and linear classifiers

Oblivious NN classifier

We can repeat the experiment using the feature extractor from Fawkes v1.0, MagFace or CLIP.
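
For example, using CLIP as an oblivious feature extractor boils down to embedding each face with a pretrained CLIP image encoder. A sketch with the openai/clip package follows; which CLIP variant eval_oblivious.py actually loads is not specified here, so ViT-B/32 is used purely for illustration:

import clip
import torch
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def clip_features(image_path):
    # Embed one face picture with CLIP's image encoder.
    image = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0).to(device)
    with torch.no_grad():
        return model.encode_image(image).squeeze(0).cpu().numpy()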

Fawkes attack & Fawkes v1.0 extractor:

python3 eval_oblivious.py --facescrub-dir facescrub/download/ --attack-dir facescrub_fawkes_attack/download/ --unprotected-file-match .jpg --protected-file-match high_cloaked.png --classifier  NN --names-list Adam_Sandler --model fawkesv10

Fawkes attack & MagFace extractor:

python3 eval_oblivious.py --facescrub-dir facescrub/download/ --attack-dir facescrub_fawkes_attack/download/ --unprotected-file-match .jpg --protected-file-match high_cloaked.png --classifier  NN --names-list Adam_Sandler --model magface --resume path/to/magface_model.pth

LowKey attack & MagFace extractor:

python3 eval_oblivious.py --facescrub-dir facescrub/download/ --attack-dir facescrub_lowkey_attack/download/ --unprotected-file-match small.png --protected-file-match attacked.png --classifier  NN --names-list Adam_Sandler --model magface --resume path/to/magface_model.pth

LowKey attack & CLIP extractor:

python3 eval_oblivious.py --facescrub-dir facescrub/download/ --attack-dir facescrub_lowkey_attack/download/ --unprotected-file-match small.png --protected-file-match attacked.png --classifier  NN --names-list Adam_Sandler --model clip

Results:

Fawkes attack & Fawkes extractor | Fawkes attack & MagFace extractor | LowKey attack & MagFace extractor | LowKey attack & CLIP extractor
Protection rate: 1.00 | Protection rate: 0.00 | Protection rate: 1.00 | Protection rate: 0.24

The Fawkes attack completely fails against MagFace: all of the user's unperturbed pictures are classified correctly.

LowKey fares a bit better: it works perfectly against MagFace, but performs poorly against CLIP, where it protects the user for only 24% of the tested pictures.

Adaptive NN classifier

This is the same setup as the baseline classifier above, but you can add the option --robust-weights cp-robust-10.ckpt to use a robustified feature extractor. This feature extractor was trained using the train_robust_features.py script, which fine-tunes a feature extractor on known attack pictures.
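
In the spirit of train_robust_features.py (the actual script's loss and training details may differ), one simple way to robustify a feature extractor is to fine-tune it so that perturbed pictures map close to their clean originals; a PyTorch-style sketch:

import torch
import torch.nn.functional as F

def robust_finetune(extractor, loader, epochs=10, lr=1e-4, device="cuda"):
    # extractor: a torch.nn.Module feature extractor; loader yields batches of
    # (clean, perturbed) image pairs for users whose attack pictures are known.
    extractor.to(device).train()
    opt = torch.optim.Adam(extractor.parameters(), lr=lr)
    for _ in range(epochs):
        for clean, perturbed in loader:
            clean, perturbed = clean.to(device), perturbed.to(device)
            # Pull the perturbed picture's features toward the clean picture's features.
            loss = F.mse_loss(extractor(perturbed), extractor(clean).detach())
            opt.zero_grad()
            loss.backward()
            opt.step()
    return extractor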

Results:

Fawkes (adaptive NN) | LowKey (adaptive NN)
Protection rate: 0.03 | Protection rate: 0.03

Both attacks fail in this setting: the model achieves an error rate of just 3% on the user's unperturbed pictures.

Linear classifiers

You can set --classifier linear to train a linear classifier instead of a nearest-neighbor one.

Baseline evaluation with end-to-end training

Train a classifier on top of the Fawkes v0.3 feature extractor end-to-end, with one attacking user.
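
End-to-end here means the feature extractor and a classification head are trained jointly on the face identities, roughly as in the PyTorch-style sketch below (purely illustrative; eval_e2e.py's actual framework, architecture, and training schedule may differ):

import torch
import torch.nn as nn

def train_end_to_end(extractor, feature_dim, num_classes, loader,
                     epochs=10, lr=1e-4, device="cuda"):
    # extractor: pretrained feature extractor wrapped as a torch.nn.Module;
    # loader yields (image, identity) batches in which the attacking user's
    # pictures are the perturbed ones.
    model = nn.Sequential(extractor, nn.Linear(feature_dim, num_classes)).to(device)
    opt = torch.optim.Adam(model.parameters(), lr=lr)  # every layer is updated
    loss_fn = nn.CrossEntropyLoss()
    model.train()
    for _ in range(epochs):
        for images, labels in loader:
            images, labels = images.to(device), labels.to(device)
            loss = loss_fn(model(images), labels)
            opt.zero_grad()
            loss.backward()
            opt.step()
    return model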

To evaluate Fawkes' attack:

python3 eval_e2e.py --gpu 0 --attack-dir facescrub_fawkes_attack/download/Adam_Sandler/face --facescrub-dir facescrub/download/ --unprotected-file-match .jpg --protected-file-match high_cloaked.png

To evaluate LowKey's attack:

python3 eval_e2e.py --gpu 0 --attack-dir facescrub_lowkey_attack/download/Adam_Sandler/face --facescrub-dir facescrub/download/ --unprotected-file-match small.png --protected-file-match attacked.png

Results:

Fawkes (baseline E2E) | LowKey (baseline E2E)
Protection rate: 0.88 | Protection rate: 0.97

Both attacks are very effective in this setting: the trained model correctly classifies only 12% (Fawkes) and 3% (LowKey) of the user's unperturbed images.

Defense evaluation with end-to-end training

Adaptive end-to-end

To evaluate robust end-to-end training, we add perturbed pictures into the model's training set. We assume here that at most half of the FaceScrub users have perturbed pictures.
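
Assembling the robust training set can be sketched as follows; the directory layout and file-name patterns are taken from the commands in this README, but the --robust option of eval_e2e.py handles this internally and its exact mixing policy may differ:

import glob
import os

def user_of(path):
    # facescrub/download/<user>/face/<file> -> <user>
    return os.path.normpath(path).split(os.sep)[-3]

def robust_training_files(facescrub_dir, public_attack_dirs, perturbed_globs):
    # Clean pictures of every FaceScrub user, labeled by identity.
    files = [(p, user_of(p))
             for p in glob.glob(os.path.join(facescrub_dir, "*", "face", "*.jpg"))]
    # Correctly labeled perturbed pictures of the users whose attacks are public
    # (at most half of the users), e.g. the Fawkes and LowKey attack directories
    # with "*high_cloaked.png" and "*attacked.png" as the matching patterns.
    for attack_dir, pattern in zip(public_attack_dirs, perturbed_globs):
        files += [(p, user_of(p))
                  for p in glob.glob(os.path.join(attack_dir, "*", "face", pattern))]
    return files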

To evaluate Fawkes' attack:

python3 eval_e2e.py --gpu 0 --attack-dir facescrub_fawkes_attack/download/Adam_Sandler/face --facescrub-dir facescrub/download/ --unprotected-file-match .jpg --protected-file-match high_cloaked.png --robust --public-attack-dirs facescrub_fawkes_attack/download facescrub_lowkey_attack/download

To evaluate LowKey's attack:

python3 eval_e2e.py --gpu 0 --attack-dir facescrub_lowkey_attack/download/Adam_Sandler/face --facescrub-dir facescrub/download/ --unprotected-file-match small.png --protected-file-match attacked.png --robust --public-attack-dirs facescrub_fawkes_attack/download facescrub_lowkey_attack/download

Results:

Fawkes (adaptive E2E) | LowKey (adaptive E2E)
Protection rate: 0.03 | Protection rate: 0.03

Again, both attacks fail against a robustified model: it achieves an error rate of just 3% on the user's unperturbed pictures.