Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Possible AdversarialTrainer bug: TypeError: fit() got an unexpected keyword argument 'nb_epochs' #996

Closed: matthew-64 closed this issue 3 years ago

matthew-64 commented 3 years ago

Describe the bug

I have been following the get_started_scikit_learn.py example. This works as expected.

However, when I try to adversarially train the classifier like so:

# Imports used in this snippet (as in get_started_scikit_learn.py)
from sklearn.svm import SVC
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import FastGradientMethod
from art.defences.trainer import AdversarialTrainer

# Step 3: Create the ART classifier
model = SVC(C=1.0, kernel="rbf")
classifier = SklearnClassifier(model=model, clip_values=(min_pixel_value, max_pixel_value))

# Step 4: Train the ART classifier
classifier.fit(x_train, y_train)

# Additional adversarial training step:
attack = FastGradientMethod(estimator=classifier, eps=0.2)
ad_trainer = AdversarialTrainer(classifier=classifier, attacks=attack)
ad_trainer.fit(x_train, y_train)

The following error occurs:

Precompute adv samples: 100%|██████████| 1/1 [00:00<00:00, 6754.11it/s]
Adversarial training epochs:   0%|          | 0/20 [00:45<?, ?it/s]
Traceback (most recent call last):
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/get_started_scikit_learn.py", line 71, in <module>
    ad_trainer.fit(x_train, y_train)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/defences/trainer/adversarial_trainer.py", line 241, in fit
    self._classifier.fit(x_batch, y_batch, nb_epochs=1, batch_size=x_batch.shape[0], verbose=0, **kwargs)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/estimators/classification/classifier.py", line 71, in replacement_function
    return fdict[func_name](self, *args, **kwargs)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/estimators/classification/scikitlearn.py", line 156, in fit
    self.model.fit(x_preprocessed, y_preprocessed, **kwargs)
TypeError: fit() got an unexpected keyword argument 'nb_epochs'

Process finished with exit code 1

Possible cause

Assuming (in the unlikely case) that I have not done something wrong, I have done a bit of digging and I think I may see the issue:

From ART requirements.txt: keras>=2.2.5

However, I see from the Keras 2.0 release notes: "The nb_epoch argument has been renamed epochs everywhere."

Could this be the cause? Or am I calling .fit(...) on my adversarial trainer incorrectly? Many thanks in advance.

Pip freeze

adversarial-robustness-toolbox==1.6.0
asgiref==3.3.1
cma==3.0.3
cycler==0.10.0
Django==3.1.7
ffmpeg-python==0.2.0
future==0.18.2
joblib==1.0.1
kiwisolver==1.3.1
llvmlite==0.35.0
matplotlib==3.3.4
mypy==0.812
mypy-extensions==0.4.3
numba==0.52.0
numpy==1.20.1
pandas==1.2.3
patsy==0.5.1
Pillow==8.1.2
pydub==0.25.1
pyparsing==2.4.7
python-dateutil==2.8.1
pytz==2021.1
resampy==0.2.2
scikit-learn==0.24.1
scipy==1.6.1
six==1.15.0
sklearn==0.0
sqlparse==0.4.1
statsmodels==0.12.2
threadpoolctl==2.1.0
tqdm==4.59.0
typed-ast==1.4.2
typing-extensions==3.7.4.3
beat-buesser commented 3 years ago

Hi @matthew-64

This is an interesting but unexpected combination of adversarial trainer and model. The model here is scikit-learn's SVC, so the issue should not be caused by Keras. The estimator for SVC provides loss gradients, as discussed in #991, but that is not what AdversarialTrainer expects: the trainer runs for multiple iterations (nb_epochs, an argument of AdversarialTrainer.fit) and, in each iteration, calls the provided classifier's fit for a single epoch (nb_epochs=1 passed to classifier.fit).

The fit method of ScikitlearnSVC does not accept an nb_epochs argument because fitting an SVC completes training in a single call to fit, so we cannot advance the model by just one optimisation step as AdversarialTrainer requires.
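
For illustration, the same TypeError can be reproduced with plain scikit-learn; this is a minimal sketch with toy data:

import numpy as np
from sklearn.svm import SVC

x = np.random.rand(10, 4)
y = np.array([0, 1] * 5)

# scikit-learn estimators train to completion in one call and accept no
# epoch-related keyword, so forwarding nb_epochs raises the same error:
SVC().fit(x, y, nb_epochs=1)
# TypeError: fit() got an unexpected keyword argument 'nb_epochs'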

If you would like to experiment with AdversarialTrainer, I would recommend using a neural network model in one of the deep learning frameworks.
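
For example, here is a minimal sketch of a PyTorch setup that AdversarialTrainer can train; the network architecture, input shape, and data (x_train, y_train as numpy arrays) are hypothetical, not taken from an ART example:

import torch.nn as nn
import torch.optim as optim
from art.estimators.classification import PyTorchClassifier
from art.attacks.evasion import FastGradientMethod
from art.defences.trainer import AdversarialTrainer

# Hypothetical small network for 4 input features and 2 classes
model = nn.Sequential(nn.Linear(4, 16), nn.ReLU(), nn.Linear(16, 2))
classifier = PyTorchClassifier(
    model=model,
    loss=nn.CrossEntropyLoss(),
    optimizer=optim.Adam(model.parameters(), lr=1e-3),
    input_shape=(4,),
    nb_classes=2,
)

attack = FastGradientMethod(estimator=classifier, eps=0.2)
trainer = AdversarialTrainer(classifier=classifier, attacks=attack)
# PyTorchClassifier.fit accepts nb_epochs, so the trainer's per-batch
# single-epoch calls work here
trainer.fit(x_train, y_train, nb_epochs=20)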

matthew-64 commented 3 years ago

Hi @beat-buesser,

Thank you for your response. Taking another look at AdversarialTrainer, I see that it expects a classifier of CLASSIFIER_LOSS_GRADIENTS_TYPE. Therefore, should it not be able to train a classifier of type ScikitlearnLogisticRegression or ScikitlearnSVC? (These are the two classifiers that I need to train.)

Accordingly, I changed my code to the following, passing a ScikitlearnLogisticRegression, which is part of CLASSIFIER_LOSS_GRADIENTS_TYPE:

from sklearn.linear_model import LogisticRegression
from art.estimators.classification.scikitlearn import ScikitlearnLogisticRegression
from art.attacks.evasion import FastGradientMethod
from art.defences.trainer import AdversarialTrainer

# Fit a baseline logistic regression to craft attacks against
sk_lr = LogisticRegression(C=0.1, penalty='l2', solver='lbfgs', multi_class='ovr', max_iter=1000)
art_lr = ScikitlearnLogisticRegression(sk_lr)
art_lr.fit(data, target)

attack = FastGradientMethod(art_lr)

# Wrap the model again and hand it to the adversarial trainer
adv_trained_lr = ScikitlearnLogisticRegression(sk_lr)
adv_trainer = AdversarialTrainer(classifier=adv_trained_lr, attacks=attack)

adv_trainer.fit(data, target)

The adv_trainer.fit(data, target) call produces the following error:

Precompute adv samples:   0%|          | 0/1 [00:00<?, ?it/s]
Precompute adv samples: 100%|██████████| 1/1 [00:00<00:00, 30.31it/s]
Adversarial training epochs:   0%|          | 0/20 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/local_testing/adversarial.py", line 274, in <module>
    run_all_tests_ad_training()
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/local_testing/adversarial.py", line 215, in run_all_tests_ad_training
    adv_trainer.fit(data.to_numpy(), target)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/defences/trainer/adversarial_trainer.py", line 241, in fit
    self._classifier.fit(x_batch, y_batch, nb_epochs=1, batch_size=x_batch.shape[0], verbose=0, **kwargs)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/estimators/classification/classifier.py", line 71, in replacement_function
    return fdict[func_name](self, *args, **kwargs)
  File "/Users/matthew/Desktop/uni_work/project/NIDS_Masters_Project/venv/lib/python3.7/site-packages/art/estimators/classification/scikitlearn.py", line 156, in fit
    self.model.fit(x_preprocessed, y_preprocessed, **kwargs)
TypeError: fit() got an unexpected keyword argument 'nb_epochs'

Process finished with exit code 1

Both the data and target values are of type <class 'numpy.ndarray'>.

I have tried but am really not seeing what I am doing wrong. Is there any way I can adversarially train the ScikitlearnLogisticRegression or ScikitlearnSVC? Any help with this matter would be hugely appreciated!

Quick reference

CLASSIFIER_LOSS_GRADIENTS_TYPE = Union[
        ClassifierLossGradients,
        EnsembleClassifier,
        GPyGaussianProcessClassifier,
        KerasClassifier,
        MXClassifier,
        PyTorchClassifier,
        ScikitlearnLogisticRegression,
        ScikitlearnSVC,
        TensorFlowClassifier,
        TensorFlowV2Classifier,
    ]
beat-buesser commented 3 years ago

Hi @matthew-64

In this case CLASSIFIER_LOSS_GRADIENTS_TYPE is only a minimum requirement that AdversarialTrainer places on the estimator. Beyond that, it also requires that the model inside the estimator can be updated by a single optimisation step (e.g. nb_epochs=1), as neural networks implemented in a deep learning framework can be, but scikit-learn models with their fit method cannot.

Are you trying to reproduce a specific paper?

beat-buesser commented 3 years ago

I just remembered that scikit-learn's SVC and LogisticRegression have a max_iter argument. You could try modifying the code to replace nb_epochs=1 with max_iter=1 in the model's arguments, although I'm not sure whether the models would still converge or become robust through adversarial training. Note also that only LogisticRegression, not SVC, provides the warm_start option.
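
Outside of AdversarialTrainer, one way to experiment with this idea is a manual loop; this is only a sketch, assuming data and target are numpy arrays as in your snippets, and 20 rounds is an arbitrary choice:

from sklearn.linear_model import LogisticRegression
from art.estimators.classification.scikitlearn import ScikitlearnLogisticRegression
from art.attacks.evasion import FastGradientMethod

# max_iter=1 limits each fit call to one solver iteration; warm_start=True
# reuses the previous coefficients instead of refitting from scratch
sk_lr = LogisticRegression(max_iter=1, warm_start=True, solver='lbfgs')
art_lr = ScikitlearnLogisticRegression(sk_lr)

art_lr.fit(data, target)
attack = FastGradientMethod(estimator=art_lr, eps=0.2)

for _ in range(20):
    x_adv = attack.generate(x=data, y=target)  # adversarial versions of the data
    art_lr.fit(x_adv, target)                  # one more solver iteration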

matthew-64 commented 3 years ago

Hi @beat-buesser.

To reply to your first comment: there is no specific paper that I am trying to match. I am simply trying to adversarially train an LR and an SVM model against an evasion attack. I was thinking of trying to come up with my own solution that adversarially trains a model following this pseudo-code:

data, target = read_data()
model = ScikitlearnLogisticRegression(...)
cycles = 5  # total number of adversarial training cycles
completed_cycles = 0
while completed_cycles < cycles:
    # Train model with existing data
    model.fit(data, target)

    # Generate adversarial attack data to be used in the next training iteration of this loop
    attack = FastGradientMethod(estimator=model, eps=0.2)
    data = attack.generate(x=data)
    completed_cycles += 1

However, I am no expert on this. I am thinking out loud.

As to your second comment, thanks I will look into it.

Regards, Matthew

beat-buesser commented 3 years ago

Hi @matthew-64

I think your pseudo-code goes in the right direction. I would recommend generating the adversarial examples with the true labels, data = attack.generate(x=data, y=target), assuming target holds the true labels of data. With adversarial training you are trying to solve a min-max optimisation problem, so you don't want to train the model to convergence in model.fit(data, target). Adversarial training also depends on the model and the hyperparameters; not every combination works.

Which dataset are you using? What is your reason to use Logistic Regression?

matthew-64 commented 3 years ago

Hey @beat-buesser,

I have edited my code accordingly. Thank you for the suggestion.

Can I also clarify what you meant by your statement: "With adversarial training you are trying to solve a min-max optimisation problem, so you don't want to train the model to convergence in model.fit(data, target)"? Are you saying that the more training iterations take place, the closer the model comes to converging on the training data, so each iteration has a reduced effect until there is no effect at all?

To answer your question: there is no official dataset that I am using. I have a NIDS that is used to detect a SYN flood. The system reads the network traffic and builds a feature set from the packets it has read. However, when I perturb certain features, such as the payload of the SYN flood packets, the detection accuracy can drop as low as 0%. I am using RF, KNN, SVM and LR sklearn classifiers to perform this.

I am trying to improve the robustness of the mentioned classifiers by adversarially training them on attacks that the literature suggests (BIM, FGM, etc.).

In terms of the adversarial training, I have encountered a factor that I did not previously consider for the retraining process, and I would be very interested in your opinion of it. I have changed the pseudo-code in my previous comment so that only a random subset of the attacks will be perturbed and returned by attack.generate(...). No background traffic will be perturbed:

data, target = read_data()

# Select random attacks from the existing dataset to perturb into adversarial attacks
ad_attacks, ad_target = select_random_attacks(data, target)

model = ScikitlearnLogisticRegression(...)

cycles = 5  # total number of adversarial training cycles
completed_cycles = 0
while completed_cycles < cycles:
    # Train model with existing data
    model.fit(data, target)

    # Generate adversarial attack data to be used in the next training iteration of this loop
    attack = FastGradientMethod(estimator=model, eps=0.2)
    ad_attacks = attack.generate(x=ad_attacks, y=ad_target)

    data, target = replace_attacks_with_adversarial_attacks(data, ad_attacks, target, ad_target)

    completed_cycles += 1

So, from the pseudo-code above, I am:

1. selecting a random subset of attacks
2. perturbing that subset into adversarial examples
3. replacing the normal attacks with their adversarial counterparts
4. training the model on the mixture of adversarial and normal data
5. repeating (if required)

I would like to draw your attention to exactly how I am retraining the model. In your opinion, does it make more sense to either:

Thanks again, Matthew

beat-buesser commented 3 years ago

Hi @matthew-64

With the min-max optimisation I was referring, for example, to equation 2.1 of Madry et al., 2017, which is central to adversarial training of neural networks and to the trainer tools in ART. This and related papers might also answer your questions about the training algorithm.
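
For reference, that saddle-point objective reads:

\min_\theta \rho(\theta), \quad \text{where} \quad
\rho(\theta) = \mathbb{E}_{(x,y) \sim \mathcal{D}} \Big[ \max_{\delta \in \mathcal{S}} L(\theta,\, x + \delta,\, y) \Big]

The inner maximisation finds the worst-case perturbation delta within the allowed set S for each sample, while the outer minimisation trains the parameters theta against those worst-case examples; this is why training to convergence on a fixed set of adversarial examples misses the point of the alternation.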

I think your application and dataset are interesting, but I'm not sure I have seen any adversarial training results published for this kind of setting. You will therefore need to investigate whether adversarial training is possible, and with which parameters, or find and reproduce existing experiments.

Let me know what your experiments find out!

matthew-64 commented 3 years ago

@beat-buesser, I have started testing the trainer with a variety of considerations. I will let you know the outcome.