Trevillie / MagNet

MagNet: a Two-Pronged Defense against Adversarial Examples
BSD 2-Clause "Simplified" License

Implementation of the CIFAR autoencoder/reformer #2

Closed mchenchen closed 6 years ago

mchenchen commented 6 years ago

Hi, I've been trying to reproduce your CIFAR results for a couple of weeks now, but after following the architecture in your paper, I only get ~40% accuracy with the detector and reformer. Would it be possible to upload your implementation of the CIFAR MagNet architecture? Thank you.

Trevillie commented 6 years ago

Hey, sorry, I didn't keep the binaries of the autoencoders, and we've moved on to new designs. BTW, under which attack and threat model did you get this result?

mchenchen commented 6 years ago

Hi, thanks for the prompt response! I am using Carlini's nn_robust_attacks with confidence 0.0 and the L2 metric. I generated 10000 randomly targeted images from the CIFAR test set, and I am also using his classifier, which achieves an accuracy of ~80% on normal test examples.
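
One common reading of "randomly targeted" is that each test image gets a target class drawn uniformly from the classes other than its true label. A minimal sketch of that label selection follows; `random_targets` is a hypothetical helper written for illustration and is not part of Carlini's attack code.

```python
import random

NUM_CLASSES = 10  # CIFAR-10 has ten classes


def random_targets(true_labels, num_classes=NUM_CLASSES, seed=0):
    """Pick, for each true label, a uniformly random target class that differs from it.

    Hypothetical helper for illustration only.
    """
    rng = random.Random(seed)
    targets = []
    for y in true_labels:
        # Draw from num_classes - 1 values, then shift past the true label
        # so the target can never equal it.
        t = rng.randrange(num_classes - 1)
        if t >= y:
            t += 1
        targets.append(t)
    return targets


true_labels = [3, 8, 8, 0, 6]
targets = random_targets(true_labels)
print(list(zip(true_labels, targets)))
```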

Here is my test_defense.py code:

idx = range(10000)
_, _, Y = prepare_data(CIFAR(), idx)
f = "cifar_test_targeted_batch9_data_subset"
testAttack = AttackData(f, Y, "Carlini L2 0.0")

detector_II = AEDetector("./defensive_models/CIFAR_II", p=1)
reformer = SimpleReformer("./defensive_models/CIFAR_II")
id_reformer = IdReformer()
def fn(correct, predicted):
    return tf.nn.softmax_cross_entropy_with_logits(labels=correct, logits=predicted / 1)
classifier = Classifier("../models/cifar", fn)

db_detector_I = DBDetector(reformer, id_reformer, classifier, T=10)
db_detector_II = DBDetector(reformer, id_reformer, classifier, T=40)

detector_dict = dict()
detector_dict["I"] = db_detector_I
detector_dict["II"] = detector_II
detector_dict["III"] = db_detector_II

operator = Operator(CIFAR(), classifier, detector_dict, reformer)

evaluator = Evaluator(operator, testAttack)
evaluator.plot_various_confidences("defense_performance_cifar", confs=[0.0], drop_rate={"I": 0.01, "II": 0.005, "III":0.01}, f=f, idx_file=idx)
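
The `drop_rate` values are presumably the fraction of clean validation examples each detector is allowed to reject, which in turn fixes that detector's threshold. A rough sketch of that idea, assuming higher scores mean "more suspicious"; `threshold_from_drop_rate` and `is_rejected` are hypothetical names, not functions from this repository.

```python
def threshold_from_drop_rate(clean_scores, drop_rate):
    """Return a threshold so that about `drop_rate` of clean scores are at or above it.

    Hypothetical helper; assumes higher score = more suspicious.
    """
    ordered = sorted(clean_scores)
    num_dropped = int(len(ordered) * drop_rate)
    if num_dropped == 0:
        # No clean example may be dropped: put the threshold above every score.
        return float("inf")
    return ordered[-num_dropped]


def is_rejected(score, threshold):
    """An input is rejected when its detector score reaches the threshold."""
    return score >= threshold


# Toy detector scores for 1000 clean validation examples.
clean_scores = [i / 1000 for i in range(1000)]
threshold = threshold_from_drop_rate(clean_scores, drop_rate=0.01)
num_rejected = sum(is_rejected(s, threshold) for s in clean_scores)
print(threshold, num_rejected)
```

With several detectors, each would get its own threshold this way, and an input would be dropped if any detector rejects it.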

And here is my train_defense.py code:

shape = [32, 32, 3]
combination_II = [3]
activation = "sigmoid"
noise = 0.025
epochs = 400

data = CIFAR()

AE_II = DAE(shape, combination_II, v_noise=noise, activation=activation)
AE_II.train(data, "CIFAR_II", num_epochs=epochs)
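
For context, the `v_noise` value presumably sets the scale of the Gaussian noise added to the training inputs of the denoising autoencoder, with the clean image kept as the reconstruction target. A minimal sketch of that corruption step, assuming pixel values in [0, 1] and clipping after the noise is added; `corrupt` is a hypothetical helper, not the repository's training code.

```python
import random


def corrupt(pixels, v_noise=0.025, seed=0):
    """Add zero-mean Gaussian noise with standard deviation `v_noise`, then clip to [0, 1].

    Hypothetical helper; the [0, 1] pixel range is an assumption.
    """
    rng = random.Random(seed)
    noisy = []
    for p in pixels:
        value = p + v_noise * rng.gauss(0.0, 1.0)
        noisy.append(min(1.0, max(0.0, value)))
    return noisy


clean = [0.0, 0.25, 0.5, 0.75, 1.0]
noisy = corrupt(clean)

# A denoising autoencoder is trained on (noisy input, clean target) pairs.
training_pair = (noisy, clean)
print(training_pair)
```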

Have I misinterpreted something in your paper?

Trevillie commented 6 years ago

Well, I'd suggest:

  1. Use a classifier with higher accuracy (like 90%+).
  2. Try a wider/deeper autoencoder structure.

mchenchen commented 6 years ago

OK thank you for the feedback - I'll try it out

mchenchen commented 6 years ago

Hi, could you possibly give more details on the autoencoder structure you used?

Trevillie commented 6 years ago

Hi, we didn't try our old model, but we can get an even better result with the following structure: input -> 3x3x32 conv -> BN -> ReLU -> 3x3x32 conv -> ReLU -> BN -> 3x3x3 conv -> output

BTW, since we don't have our old classifier, we used a DenseNet classifier with acc ~92%.

Hope this helps.
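
The structure described above can be traced layer by layer to check that the output has the same shape as a CIFAR image and to see how small the model is. This is only a sketch: the stride of 1, the "same" padding, and the bias terms are assumptions that the comment does not state, and all function names are hypothetical.

```python
def conv3x3(in_shape, filters):
    """Output shape and trainable parameter count of a 3x3 conv.

    Assumes stride 1, 'same' padding, and a bias per filter.
    """
    h, w, c = in_shape
    params = 3 * 3 * c * filters + filters
    return (h, w, filters), params


def batch_norm(in_shape):
    """Batch norm keeps the shape and learns one scale and one shift per channel."""
    return in_shape, 2 * in_shape[2]


def relu(in_shape):
    """ReLU keeps the shape and has no parameters."""
    return in_shape, 0


# Layer order copied from the comment, including the ReLU -> BN order in the second block.
LAYERS = [
    ("conv 3x3x32", lambda s: conv3x3(s, 32)),
    ("BN", batch_norm),
    ("ReLU", relu),
    ("conv 3x3x32", lambda s: conv3x3(s, 32)),
    ("ReLU", relu),
    ("BN", batch_norm),
    ("conv 3x3x3", lambda s: conv3x3(s, 3)),
]


def trace(input_shape=(32, 32, 3)):
    """Walk the layer list, returning per-layer rows, the final shape, and total parameters."""
    shape, total, rows = input_shape, 0, []
    for name, layer in LAYERS:
        shape, params = layer(shape)
        total += params
        rows.append((name, shape, params))
    return rows, shape, total


rows, output_shape, total_params = trace()
for name, shape, params in rows:
    print(f"{name:12s} -> {shape}  trainable params = {params}")
print("output shape:", output_shape, "total trainable params:", total_params)
```

Under these assumptions there is no downsampling anywhere, so the autoencoder maps a 32x32x3 input to a 32x32x3 output.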

ashleyxly commented 6 years ago

I am actually confused. Is the architecture you show the detector model? Or detector and reformer model? Would you please clarify? Thanks!

Trevillie commented 6 years ago

@ashleyxly Both. But note that we did this purely for simplicity. Better results are expected if you opt for different autoencoders.
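
To illustrate how a single autoencoder can serve both roles: the detector scores an input by its reconstruction error, and the reformer passes the reconstruction itself on to the classifier. The sketch below assumes a mean of |x - AE(x)|^p as the score; `toy_autoencoder` is a hypothetical stand-in for a trained model, and none of these names come from the repository.

```python
def reconstruction_error(x, autoencoder, p=1):
    """Detector score: mean of |x - AE(x)|^p over all pixels."""
    recon = autoencoder(x)
    return sum(abs(a - b) ** p for a, b in zip(x, recon)) / len(x)


def reform(x, autoencoder):
    """Reformer output: the reconstruction, which is what the classifier would see."""
    return autoencoder(x)


def toy_autoencoder(x):
    # Hypothetical stand-in for a trained model: pulls every pixel toward 0.5.
    return [0.5 + 0.8 * (v - 0.5) for v in x]


x = [0.0, 0.5, 1.0, 0.5]
score = reconstruction_error(x, toy_autoencoder, p=1)  # used by the detector
reformed = reform(x, toy_autoencoder)                  # used by the reformer
print(score, reformed)
```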

ashleyxly commented 6 years ago

I see. So the adversarial examples are generated against the DenseNet classifier, and you used the autoencoder to detect and reform them?

Trevillie commented 6 years ago

@ashleyxly Yep.

EmotionalXX commented 4 years ago

Hi, I built the classifier according to the CIFAR-10 network described in your paper, but its classification accuracy is only 75%. When I use this classifier for testing, the performance of the entire defense model is also poor. Could you provide the original CIFAR-10 network file, or the DenseNet network file? Thank you very much!

Trevillie commented 4 years ago

@EmotionalXX Hi. I don't have the original classifier anymore. It should be straightforward to train a decent classifier on CIFAR with architectures like DenseNet though.