Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Not generating Adversarial examples #2388

Closed sriharshapalvadi closed 4 months ago

sriharshapalvadi commented 7 months ago

Describe the bug

When I apply the ZOO attack to a Gradient Boosting classifier trained on the Adult income dataset, it does not generate any adversarial examples. It generates adversarial examples only when I first scale the dataset with MinMaxScaler.

To Reproduce

Below is the code that I am using:

import numpy as np
import shap
from sklearn.model_selection import train_test_split
from art.attacks.evasion import ZooAttack
from art.estimators.classification import SklearnClassifier

# Loading pre-processed data
X, y = shap.datasets.adult()
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=7)

# model_gb is the trained Gradient Boosting classifier (its definition is not shown in this report)
art_classifier = SklearnClassifier(model=model_gb)

# Create ART Zeroth Order Optimization attack
zoo = ZooAttack(classifier=art_classifier, confidence=0.0, learning_rate=1e-1, max_iter=20, binary_search_steps=10, use_resize=False, use_importance=False, nb_parallel=1, batch_size=1, variable_h=0.2)

# Generate adversarial samples with ART Zeroth Order Optimization attack
adversaries = zoo.generate(np.array(X_test))

Expected behavior

I expect the attack to generate at least a few adversarial examples.

Screenshots

[Screenshot: Zoo attack]

System information (please complete the following information):

beat-buesser commented 7 months ago

Hi @sriharshapalvadi Thank you very much for using ART! How are you defining clip_values in SklearnClassifier? It should be set according to the range of the input values (pixel values for images, feature values for tabular data). Also try running the attack against the true labels instead of the predictions, using adversaries = zoo.generate(x=np.array(X_test), y=y_test), to avoid creating adversarial examples for samples that are already misclassified.
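The two suggestions above could be sketched roughly as follows. This is a minimal sketch, not a confirmed fix for this issue: the helper names global_clip_values, to_one_hot and run_zoo are hypothetical, taking clip_values as the global minimum and maximum of the training data is an assumption (the thread does not say which range to use), and the labels are one-hot encoded only as a precaution.

```python
import numpy as np


def global_clip_values(x_train):
    """Return (min, max) over all features of the training data as floats.

    Assumption: a single global range is used for clip_values.
    """
    x_train = np.asarray(x_train, dtype=np.float32)
    return float(x_train.min()), float(x_train.max())


def to_one_hot(y, nb_classes):
    """One-hot encode integer or boolean class labels."""
    indices = np.asarray(y).astype(int)
    return np.eye(nb_classes, dtype=np.float32)[indices]


def run_zoo(model_gb, X_train, X_test, y_test):
    """Run ZOO against the true labels with clip_values set (hypothetical helper)."""
    # Imported here so the helpers above can be used without ART installed.
    from art.attacks.evasion import ZooAttack
    from art.estimators.classification import SklearnClassifier

    art_classifier = SklearnClassifier(
        model=model_gb,
        clip_values=global_clip_values(X_train),
    )
    # Same attack parameters as in the report above.
    zoo = ZooAttack(
        classifier=art_classifier,
        confidence=0.0,
        learning_rate=1e-1,
        max_iter=20,
        binary_search_steps=10,
        use_resize=False,
        use_importance=False,
        nb_parallel=1,
        batch_size=1,
        variable_h=0.2,
    )
    x = np.asarray(X_test, dtype=np.float32)
    # Pass the true labels rather than letting the attack use predictions.
    return zoo.generate(x=x, y=to_one_hot(y_test, nb_classes=2))
```

With the variables from the report, this would be called as adversaries = run_zoo(model_gb, X_train, X_test, y_test). Whether this alone yields adversarial examples on the unscaled data is not established in this thread.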