This is a one-shot kill attack on a transfer learning experiment for binary classification (dog-vs-fish) that uses Inception-v3, based on what is described in the Poison Frogs! paper.
We define three terms for this attack: a) base instance: the image that the poison resembles in input (image) space; b) poison instance: the instance added to the training data to cause the misclassification. It looks like the base instance in input space but like the target instance in feature space; c) target instance: the instance we are attacking. Our goal is to have this instance misclassified as the class of the base instance.
The attack crafts a poison instance that starts from, and stays close to, the base instance in input space, while its feature representation is driven close to the target instance's feature representation. Given a target instance, there are many ways to select a base instance: we can choose one whose feature representation is already close to that of the target instance, pick a random base instance, or pick any one we like. This choice may change the number of iterations needed for the optimization. In our experiments, we noticed that the size of the base instance is important and can help in "making" the poison in fewer iterations; we have therefore selected some preferred base instances, whose indices are available in the main_oneShot.py script.
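For reference, the objective solved when crafting the poison (Eq. 1 of the Poison Frogs! paper) balances feature-space proximity to the target against input-space proximity to the base, where f is the Inception-v3 feature extractor (penultimate layer), t the target instance, b the base instance, and beta the trade-off coefficient:

p = argmin_x ||f(x) - f(t)||^2 + beta * ||x - b||^2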
To reproduce the results, the following steps should be taken:
Update the directories for the saved raw images as needed. They are initially set to main_dog and fish.
If this is the first time running the script, set firsTime = True in the main_oneShot.py script. This will do the following:
If it is not the first time running the script and all of the data is loaded, the remainder of the script does the following:
"Poison making" is done via a forward-backward splitting algorithm. For details, see the utility file and/or our paper.
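As a rough illustration, here is a minimal sketch of the forward-backward splitting iteration. The helper grad_fn is a placeholder for the gradient of the feature-space loss through Inception-v3 and is not a function from this repository; see the utility file for the actual implementation.

```python
import numpy as np

def make_poison(base_img, grad_fn, lr=0.01, beta=0.25, max_iters=1000):
    """Sketch of forward-backward splitting for poison crafting.

    grad_fn(x) is assumed to return the gradient of
    ||f(x) - f(target)||^2 with respect to the input x, where f is the
    feature extractor (e.g. the Inception-v3 penultimate layer).
    Hypothetical helper, not part of this repository.
    """
    x = base_img.astype(np.float32).copy()
    for _ in range(max_iters):
        # Forward step: gradient descent on the feature-space distance
        # between the poison and the target instance.
        x_fwd = x - lr * grad_fn(x)
        # Backward (proximal) step: closed-form update that keeps the
        # poison close to the base instance in input space.
        x = (x_fwd + lr * beta * base_img) / (1.0 + lr * beta)
    return x
```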
In this part, we check the effectiveness of every attack. Since we re-use the same graph for every attack, we save the graph definitions (graphdefs) to file. This is a one-time step; if you have already done it, there is no need to redo it. If you have not, run make_graph_fr_warm_and_cold.py.
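For reference, saving and re-loading a graphdef in TensorFlow 1.x looks roughly like the sketch below; the file name eval_graph.pb and the directory are illustrative, not taken from make_graph_fr_warm_and_cold.py.

```python
import tensorflow as tf

# One-time step: write the evaluation graph's graphdef to disk.
with tf.Session() as sess:
    # ... build the evaluation graph here ...
    tf.train.write_graph(sess.graph_def, './graphs', 'eval_graph.pb', as_text=False)

# Later runs: re-load the saved graphdef instead of rebuilding the graph.
graph_def = tf.GraphDef()
with tf.gfile.GFile('./graphs/eval_graph.pb', 'rb') as f:
    graph_def.ParseFromString(f.read())
tf.import_graph_def(graph_def, name='')
```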
Then, depending on whether the attack is warm-start or cold-start, run the appropriate file.
If you find this study useful for your own research, please consider citing the Poison Frogs! paper:
@ARTICLE{2018arXiv180400792S,
author = {{Shafahi}, A. and {Ronny Huang}, W. and {Najibi}, M. and {Suciu}, O. and
{Studer}, C. and {Dumitras}, T. and {Goldstein}, T.},
title = "{Poison Frogs! Targeted Clean-Label Poisoning Attacks on Neural Networks}",
journal = {ArXiv e-prints},
archivePrefix = "arXiv",
eprint = {1804.00792},
primaryClass = "cs.LG",
keywords = {Computer Science - Learning, Computer Science - Cryptography and Security, Computer Science - Computer Vision and Pattern Recognition, Statistics - Machine Learning},
year = 2018,
month = apr,
adsurl = {http://adsabs.harvard.edu/abs/2018arXiv180400792S},
adsnote = {Provided by the SAO/NASA Astrophysics Data System}
}