rfeinman / detecting-adversarial-samples

Code for "Detecting Adversarial Samples from Artifacts" (Feinman et al., 2017)

Final detector is evaluated on data it has been trained on? #5

Open · davidglavas opened this issue 5 years ago

davidglavas commented 5 years ago

The code that creates the detector (a logistic regression classifier):

https://github.com/rfeinman/detecting-adversarial-samples/blob/2c26b603bfadc25521c2bd4c8cc838ac4a484319/scripts/detect_adv_samples.py#L149-L155

The variables values and labels returned by train_lr() are the training data that the model was fit on:

from sklearn.linear_model import LogisticRegressionCV

def train_lr(densities_pos, densities_neg, uncerts_pos, uncerts_neg):
    # [...data-preparation code left out...]

    lr = LogisticRegressionCV(n_jobs=-1).fit(values, labels)
    return values, labels, lr

At the end, the detector is evaluated on the data it was trained on (line 159 uses values, which is the training data returned by train_lr):

https://github.com/rfeinman/detecting-adversarial-samples/blob/2c26b603bfadc25521c2bd4c8cc838ac4a484319/scripts/detect_adv_samples.py#L157-L168
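
Concretely, the evaluation amounts to something like the following sketch (not the script's exact code; it assumes the values, labels, and lr returned by train_lr above):

from sklearn.metrics import roc_auc_score

# Probabilities predicted on the *same* points the model was just fit on
probs = lr.predict_proba(values)[:, 1]
print('ROC-AUC (computed on the training data):', roc_auc_score(labels, probs))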

Evaluating on the training data gives an optimistically biased estimate of the detector's performance. Did I miss anything?
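
One way to remove the bias would be to hold out a test split before fitting. A minimal sketch, again assuming the values and labels arrays built inside train_lr (the split parameters are illustrative):

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Hold out 30% of the data; the detector never sees it during fitting
X_train, X_test, y_train, y_test = train_test_split(
    values, labels, test_size=0.3, stratify=labels, random_state=0)

lr = LogisticRegressionCV(n_jobs=-1).fit(X_train, y_train)
probs = lr.predict_proba(X_test)[:, 1]
print('ROC-AUC (held-out data):', roc_auc_score(y_test, probs))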

yoheikikuta commented 4 years ago

That's what I thought as well. The ROC-AUC evaluation appears to use the same data that the model was trained on. Is this implementation what the authors intended?
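
If holding out data is undesirable because the sample is small, a cross-validated ROC-AUC would also avoid the bias. A sketch, assuming the same values and labels as above; each sample is scored by a model that never saw it:

from sklearn.linear_model import LogisticRegressionCV
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import cross_val_predict

# Out-of-fold probabilities over 5 folds
probs = cross_val_predict(
    LogisticRegressionCV(n_jobs=-1), values, labels,
    cv=5, method='predict_proba')[:, 1]
print('Cross-validated ROC-AUC:', roc_auc_score(labels, probs))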