SforAiDl / decepticonlp

Python Library for Robustness Monitoring and Adversarial Debugging of NLP models
MIT License

Plan for the pipeline [UPDATED] #79

Open someshsingh22 opened 4 years ago

someshsingh22 commented 4 years ago

I will put up the assumed pseudocode as well; first, let's discuss the features / plan in words. NOTE: THIS IS FOR SINGLE-PASS ATTACKS ONLY. We missed out on approaches where black-box attacks get the classification results from the models.

SINGLE PASS

TRAIN

Additional - We can add three versions of decepticons: strong, stealthy, and balanced. top_k rankings will be done on the basis of fall in accuracy (fall), metric distance, or a weighted mean (see the sketch below).
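For concreteness, here is a rough sketch of how the three modes could rank candidates. The function names, the 0.7 / 0.3 weights, and the score convention are placeholders for discussion, not a decided design.

# Hypothetical ranking helper for the three proposed modes.
# Names and weights are placeholders, not decided values.

def score_adversary(acc_fall, metric_distance, mode="balanced"):
    """acc_fall: drop in model accuracy caused by the adversary (higher = stronger).
    metric_distance: distance between original and perturbed text (lower = stealthier)."""
    if mode == "strong":      # rank purely by damage done to the model
        return acc_fall
    if mode == "stealthy":    # rank purely by how little the text changed
        return -metric_distance
    return 0.7 * acc_fall - 0.3 * metric_distance   # balanced: weighted mean

def top_k_adversaries(candidates, k=3, mode="balanced"):
    """candidates: list of (adversary, acc_fall, metric_distance) tuples."""
    ranked = sorted(candidates, key=lambda c: score_adversary(c[1], c[2], mode), reverse=True)
    return [adv for adv, _, _ in ranked[:k]]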

someshsingh22 commented 4 years ago
from decepticonlp import attack
from keras.datasets import imdb
import torch #can be keras / tf

#Load Data

train, test = imdb.load_data()    #keras returns (x_train, y_train), (x_test, y_test)

#Generate Adversaries
#grid will be params consisting of metrics, transforms

attacker = attack.decepticon(task="classification", method="single_pass")
adversaries = attacker.generate_from_grid(train, grid, top_k=3)
adversaries_list = adversaries.get_list()     #returns list of datasets/dataloaders

#Load Model
MODEL_PATH = "PATH_TO_MODEL"
model = torch.load(MODEL_PATH)

#Import inference class / User implements his own loop
from user_model import test    #user's inference loop; note this shadows the IMDB test split loaded above

#Display results
results = adversaries.evaluate(test)
results.show()

#IMPORTANT
#For methods that receive the classification labels / translated text etc
#Init attacker and training setup

attacker = attack.decepticon(task="classification", method="train")
attacker.set_grid_trainer(train, grid, test)
attacker.train(epochs=10, **kwargs) #train

#Generate Adversaries

adversaries = attacker.generate_from_trainer(top_k=3)
adversaries_list = adversaries.get_list()     #returns list of datasets/dataloaders

#Display results
results = adversaries.evaluate(test)
results.show()
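For discussion, here is a rough sketch of what the single-pass generate_from_grid step could do internally. Everything beyond the generate_from_grid name itself (the Decepticon class shape, the assumption that each transform exposes an apply(text) method, and the (text, label) dataset layout) is an assumption for illustration, not the settled API.

# Rough sketch of the single-pass generation step. Assumptions: the dataset is
# an iterable of (text, label) pairs and each transform exposes apply(text).

class Decepticon:
    def __init__(self, task="classification", method="single_pass"):
        self.task = task
        self.method = method

    def generate_from_grid(self, dataset, grid, top_k=3):
        """Apply every transform listed in the grid to the dataset and keep the
        perturbed copies. No forward pass happens here; ranking by the grid's
        metrics is deferred to evaluate(), once the user's loop is available."""
        candidates = []
        for transform in grid["strategies"]:
            perturbed = [(transform.apply(text), label) for text, label in dataset]
            candidates.append((transform, perturbed))
        # In the real pipeline these would be wrapped in an adversaries object
        # exposing get_list() and evaluate(); a plain list keeps the sketch short.
        return candidates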
someshsingh22 commented 4 years ago

If the tasks look separate enough, please ping me; I will separate the trainer and single-pass methods.

abheesht17 commented 4 years ago

@someshsingh22 @rajaswa A few questions. Suppose:

grid = {'metrics': ['accuracy'], 'strategies': [transforms.AddChar(), transforms.DeleteChar(), ...]}
adversaries = attacker.generate_from_grid(train, grid, top_k=3)
adversaries_list = adversaries.get_list()     #returns list of datasets/dataloaders

... (1)

Now, we intend to generate top_k adversaries based on the accuracy computed (lower the accuracy --> better the adversary).

But to compute accuracy, we will need to do a forward pass. Where do we do that?

We do it later:

#Import inference class / User implements his own loop
from user_model import test

#Display results
results = adversaries.evaluate(test)
results.show()

However, won't we have to pass the user's loop in (1) in order to compute metrics like accuracy / F1 score?

someshsingh22 commented 4 years ago

Top K is only for choosing which results to show; the metrics will be computed in adversaries.evaluate(test), where test is the user's loop.
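To make that concrete, here is a minimal illustration of the point: the forward pass happens only inside evaluate(), which receives the user's loop. The Adversaries class, its attributes, and the clean_accuracy argument are assumed names for illustration, not the library's API.

# Minimal illustration: metrics are computed once the user's loop is available.

class Adversaries:
    def __init__(self, candidates, top_k=3):
        self.candidates = candidates   # list of (transform, perturbed_dataset) pairs
        self.top_k = top_k

    def evaluate(self, user_loop, clean_accuracy=None):
        """user_loop(dataset) -> accuracy, supplied by the user."""
        scored = []
        for transform, perturbed in self.candidates:
            acc = user_loop(perturbed)                 # forward pass on adversarial data
            fall = clean_accuracy - acc if clean_accuracy is not None else -acc
            scored.append((transform, acc, fall))
        scored.sort(key=lambda s: s[2], reverse=True)  # largest accuracy fall first
        return scored[: self.top_k]                    # top_k only selects what to show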