StephanZheng / neural-fingerprinting

BSD 3-Clause "New" or "Revised" License

Can't reproduce results, issue with adversarial example sampling #2

Closed shairoz closed 5 years ago

shairoz commented 5 years ago

Following the documentation, I was able to train and evaluate a model on CIFAR-10. I trained for 100 epochs and got an accuracy of 87% with 30 fingerprints, then tested against the CW-L2 attack with the number of iterations reduced to 10 (which should make the attack easier to defend against; its success rate was around 70%). I got an AUC of ~92%, which is lower than reported. Additionally, when examining the AUC calculation procedure, I noticed that the iterator in fp_eval.py line 163, `for e, (x, y) in enumerate(data_loader):`,

which is meant to iterate over the adversarial samples created, only runs once, over 14 samples, so the AUC is calculated on a very small subset.
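For context, this truncation is the classic symptom of a custom `Dataset` whose `__len__` returns a hard-coded constant: the `DataLoader` yields only that many samples no matter how much data is actually stored. A minimal sketch with a hypothetical dataset class (not the repo's actual code):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class TruncatedDataset(Dataset):
    """Illustrative dataset whose __len__ is hard-coded too small."""
    def __init__(self, n):
        self.data = torch.arange(n, dtype=torch.float32)

    def __getitem__(self, idx):
        return self.data[idx]

    def __len__(self):
        return 14  # hard-coded constant: the DataLoader stops after 14 samples

loader = DataLoader(TruncatedDataset(10000), batch_size=128)
seen = sum(batch.numel() for batch in loader)
print(seen)  # 14, even though 10000 samples are stored
```

With `batch_size=128` and a reported length of 14, the loop body runs exactly once on a single 14-sample batch, matching the behavior described above.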

Code to reproduce:

Training:

```shell
python neural-fingerprinting/cifar/train_fingerprint.py --batch-size 128 --test-batch-size 128 --epochs 100 --lr 0.01 --momentum 0.9 --seed 0 --log-interval 10 --log-dir neural-fingerprinting/cifar/log --data-dir /tmp/nfp/cifar/data --eps=0.1 --num-class=10 --num-dx=10 --name=cifar
```

Creating adversarial samples (manually setting the number of iterations to 10):

```shell
python neural-fingerprinting/cifar/gen_whitebox_adv.py --attack cw-l2 --ckpt cifar/log/ckpt/state_dict-ep_100.pth --log-dir cifar/log/adv --batch-size 128
```

Evaluating:

```shell
python neural-fingerprinting/cifar/eval_fingerprint.py --batch-size 128 --epochs 100 --lr 0.001 --momentum 0.9 --seed 0 --log-interval 10 --ckpt cifar/log/ckpt/state_dict-ep_100.pth --log-dir cifar/log/eval --fingerprint-dir cifar/log/ --adv-ex-dir cifar/log/adv --data-dir /tmp/nfp/cifar/data --eps=0.1 --num-dx=10 --num-class=10 --name=cifar
```

I would appreciate a clarification on what could have gone wrong. Thanks

dathath commented 5 years ago

I believe that the evaluation here is problematic. If it's not an adversarial example, it's quite possible that it won't be detected. Can you look at the samples that actually changed label and try to detect them? Looking at samples that were randomly perturbed to a small extent isn't a meaningful evaluation in my opinion.
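One way to act on this suggestion is to keep only the perturbed samples whose predicted label actually flipped, and run detection on those alone. A minimal sketch, using a stand-in linear classifier and hypothetical names (`x_clean`, `x_adv`) rather than the repo's evaluation code:

```python
import torch

torch.manual_seed(0)

# Hypothetical stand-ins: any classifier plus clean and perturbed input batches.
model = torch.nn.Linear(8, 3)
x_clean = torch.randn(32, 8)
x_adv = x_clean + 0.5 * torch.randn(32, 8)  # stand-in "perturbation"

with torch.no_grad():
    y_clean = model(x_clean).argmax(dim=1)
    y_adv = model(x_adv).argmax(dim=1)

# Keep only samples whose predicted label actually changed; only these are
# adversarial in the sense that matters for evaluating a detector.
flipped = y_clean != y_adv
x_eval = x_adv[flipped]
print(f"{flipped.sum().item()} / {len(x_adv)} samples changed label")
```

Perturbed inputs that do not change the prediction are filtered out before computing detection metrics, which avoids scoring the detector on samples that are not adversarial at all.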

Perhaps you could start with PGD or FGSM. CW can take a few iterations to converge.

But to re-state my previous comment: if the samples are not adversarial, they are not easier to detect -- on the contrary, they are much harder. If you set iterations to 1, the AUC-ROC would drop even more, and if you set iterations to 0, it would drop to ~0.5.

dathath commented 5 years ago

I found the issue related to the 14 samples -- https://github.com/StephanZheng/neural-fingerprinting/blob/master/cifar/custom_datasets.py. Can you update the last line so it returns the actual length of your dataset? I can make the change sometime if you can't figure it out.
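The fix amounts to having `__len__` return the real size of the stored data rather than a constant. A sketch with a generic tensor-backed dataset (not the exact contents of custom_datasets.py):

```python
import torch
from torch.utils.data import Dataset, DataLoader

class AdvDataset(Dataset):
    """Generic tensor-backed dataset with a correct __len__."""
    def __init__(self, xs, ys):
        self.xs, self.ys = xs, ys

    def __getitem__(self, idx):
        return self.xs[idx], self.ys[idx]

    def __len__(self):
        # Return the actual number of stored samples, not a hard-coded value.
        return len(self.xs)

xs = torch.randn(1000, 3, 32, 32)  # e.g. 1000 CIFAR-sized adversarial samples
ys = torch.randint(0, 10, (1000,))
loader = DataLoader(AdvDataset(xs, ys), batch_size=128)
total = sum(x.shape[0] for x, _ in loader)
print(total)  # 1000: every stored sample is visited
```

With the length reported correctly, the evaluation loop in fp_eval.py would iterate over every adversarial sample instead of stopping after one small batch.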

dathath commented 5 years ago

Closed because of no activity. Feel free to reopen if you still can't reproduce based on my suggestions.