mlp output for eps=0 (retrain, recons) not matching up with unattacked model

paullintilhac commented 2 years ago

I haven't actually looked at the recons output, but I assume it has the same problem. I believe uttam was able to get this to work better on an older commit. With eps=0, the accuracy should at least be within a couple of percentage points of the accuracy without an attack.

uttam-rao commented 2 years ago

The accuracy values might be off for the baseline models too, not just when an adversarial perturbation is added. For example, when running the recons defense on the cnn, the reported accuracy before any attacks or defenses is 96.70, but on the adversarial examples with dev_mag=0.1 the accuracy jumps to 98.43, which doesn't make sense: a perturbation should not raise accuracy above the clean baseline.

The older commit does give accuracies that are consistent with the paper and make sense intuitively for all models. For example, running the same cnn model on the older commit gives a baseline accuracy of 98.99 before any attacks or defenses, which drops to 98.71 with dev_mag=0.1, as expected. The results from the older commit are in the 189_project_older_commit.ipynb colab notebook, which was shared with you a while ago.

uttam-rao commented 2 years ago

Resolved now in the accuracy_fix branch. The error was that the line X_test_orig -= mean was commented out in attack_wrapper().
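For anyone hitting something similar, here is a minimal sketch of why a skipped mean subtraction can make the eps=0 numbers disagree with the unattacked ones. It is not the project's code: the toy data, the nearest-centroid classifier (standing in for the mlp/cnn), and the attack_path() helper with its center flag are all hypothetical, and the constant perturbation is only a placeholder for a real attack. The only assumption is that the model was trained on mean-centered inputs.

```python
import numpy as np

def nearest_centroid_predict(X, centroids):
    # Assign each row of X to the closest centroid (squared Euclidean distance).
    d = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
    return d.argmin(axis=1)

def accuracy(X, y, centroids):
    return float((nearest_centroid_predict(X, centroids) == y).mean())

def attack_path(X_test_orig, mean, eps, center=True):
    # Hypothetical stand-in for the attack code path, NOT the repo's attack_wrapper().
    X = X_test_orig.astype(float).copy()
    if center:
        X -= mean  # the step that was commented out in the bug report
    # Placeholder perturbation; with eps=0 the inputs are left unchanged.
    return X + eps * np.ones_like(X)

rng = np.random.default_rng(0)

def make_data(n_per_class):
    # Two toy classes centered at (3, 5) and (7, 5).
    X0 = rng.normal(loc=[3.0, 5.0], scale=0.5, size=(n_per_class, 2))
    X1 = rng.normal(loc=[7.0, 5.0], scale=0.5, size=(n_per_class, 2))
    X = np.vstack([X0, X1])
    y = np.array([0] * n_per_class + [1] * n_per_class)
    return X, y

X_train, y_train = make_data(200)
X_test, y_test = make_data(100)

# "Training": center the data, then store one centroid per class.
mean = X_train.mean(axis=0)
Xc = X_train - mean
centroids = np.stack([Xc[y_train == k].mean(axis=0) for k in (0, 1)])

# Unattacked baseline, evaluated on centered test data.
acc_base = accuracy(X_test - mean, y_test, centroids)
# eps=0 through the attack path, with and without the centering step.
acc_fixed = accuracy(attack_path(X_test, mean, 0.0, center=True), y_test, centroids)
acc_bug = accuracy(attack_path(X_test, mean, 0.0, center=False), y_test, centroids)

print("baseline:", acc_base)
print("eps=0, centered:", acc_fixed)
print("eps=0, not centered:", acc_bug)
```

With centering in place, the eps=0 path sees exactly the same inputs as the baseline, so the two accuracies agree. Without it, the model is evaluated on data from a shifted distribution and the numbers diverge. A check like acc_fixed == acc_base could serve as a cheap regression test before running any real attack.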

paullintilhac / cosc189-project, issue #6