Trusted-AI / adversarial-robustness-toolbox

Adversarial Robustness Toolbox (ART) - Python Library for Machine Learning Security - Evasion, Poisoning, Extraction, Inference - Red and Blue Teams
https://adversarial-robustness-toolbox.readthedocs.io/en/latest/
MIT License

Add support for scikit-learn models #47

Closed ririnicolae closed 4 years ago

ririnicolae commented 5 years ago

Investigate to which extent models from scikit-learn can fit under the Classifier interface. Some features from the ART API might not be available for this framework. The analysis should cover what those features are. Similarly, if ART can only support some models from scikit-learn, a list of these should be provided.

Depending on the result of the analysis, the list of features to be implemented will be scoped.

imolloy commented 5 years ago

I can help with this one.

beat-buesser commented 5 years ago

I'm interested to help

ririnicolae commented 5 years ago

@beat-buesser @imolloy I think this is quite a big task, so I suggest you share responsibilities. One of you could perform the analysis discussed above. Based on that, you can split the implementation between you. Let me know what you think.

imolloy commented 5 years ago

@ririnicolae Completely agree. We should start with a list of the models we're interested in, e.g., classifiers, regression, etc. Next, there's a question about whether the existing sklearn interfaces are going to be sufficient, or if we need to be model-specific. For example, whether fit, predict, predict_proba, transform, and fit_transform will be enough (which implies black-box attacks), or whether we need to dig into the models themselves, e.g., coef_ and intercept_ for LinearRegression and SVC (along with dual_coef_), and so on.

My initial target would be SVC and LogisticRegression.
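For concreteness, a quick sketch of the two levels of access on toy data (plain scikit-learn, no ART code; the model and dataset choices are arbitrary):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Generic estimator interface: enough for black-box attacks.
logreg = LogisticRegression(max_iter=1000).fit(X, y)
probs = logreg.predict_proba(X[:5])        # class probabilities
preds = logreg.predict(X[:5])              # hard labels

# Model-specific internals: needed for white-box (gradient-based) attacks.
w, b = logreg.coef_, logreg.intercept_     # decision-function weights and bias

svc = SVC(kernel="linear").fit(X, y)
alpha = svc.dual_coef_                     # signed dual coefficients per support vector
sv = svc.support_vectors_                  # the support vectors themselves
```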

ririnicolae commented 5 years ago

@imolloy I think classification is a good place to start, and we should at least be able to cover SVC & LogisticRegression, as you suggested. One of the questions at this point is: will we be able to extract gradients from scikit-learn? These would be vital for white-box attacks (class_gradient and loss_gradient in the Classifier API). But you're right, we can always limit the support to black-box attacks if we can't get gradients.
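To make the gradient question concrete: for logistic regression the cross-entropy loss gradient with respect to the input can be written directly from coef_, so at least this model admits exact white-box gradients. A rough sketch (not ART code; assumes scikit-learn's multinomial/softmax formulation rather than one-vs-rest):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(max_iter=1000).fit(X, y)

def loss_gradient(model, x, label):
    """Gradient of the cross-entropy loss w.r.t. the input x for a
    softmax logistic regression: dL/dx = W^T (p - onehot(label))."""
    p = model.predict_proba(x.reshape(1, -1))[0]   # softmax probabilities
    onehot = np.eye(len(model.classes_))[label]
    return model.coef_.T @ (p - onehot)            # shape (n_features,)

grad = loss_gradient(clf, X[0], y[0])
```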

imolloy commented 5 years ago

Looking at the main modules and classes, this list might get us started. I don't think we'll be able to write a generic gradient function in a white-box setting, but we can default to a black-box approximation using something similar to ZOO or NES. We can try computing gradients for some of the simpler classes and see how it goes.
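For models where analytic gradients aren't available, a symmetric finite-difference estimate that only queries predict_proba would be one way to go, roughly along these lines (a sketch, not tied to any existing ART class):

```python
import numpy as np

def estimate_loss_gradient(predict_proba, x, label, eps=1e-3):
    """ZOO/NES-style zeroth-order estimate of the cross-entropy loss
    gradient, using only black-box probability queries."""
    def loss(z):
        p = predict_proba(z.reshape(1, -1))[0, label]
        return -np.log(max(p, 1e-12))

    grad = np.zeros_like(x, dtype=float)
    for i in range(x.size):
        step = np.zeros_like(x, dtype=float)
        step[i] = eps
        grad[i] = (loss(x + step) - loss(x - step)) / (2 * eps)  # symmetric difference
    return grad

# Hypothetical usage: g = estimate_loss_gradient(clf.predict_proba, X[0].astype(float), y[0])
```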

beat-buesser commented 5 years ago

Sounds good to me. I have created a new development branch development_sklearn and will start building prototypes there for sklearn.linear_model.LogisticRegression to explore some of the challenges ahead.

beat-buesser commented 5 years ago

I have pushed a prototype of a classifier for sklearn.linear_model.LogisticRegression to branch development_sklearn in 49a9429d5bfe10c464997f9a6a633cedae87c5e6.

The new notebook sklearn_logistic_regression.ipynb includes examples using the MNIST dataset and art.attacks.projected_gradient_descent.ProjectedGradientDescent.
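For anyone who wants to try this kind of experiment, here is a minimal end-to-end sketch written against the current ART API (art.estimators.classification.SklearnClassifier and art.attacks.evasion.ProjectedGradientDescent; module paths on the development branch may differ, and the subset sizes and attack parameters are arbitrary):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from art.utils import load_mnist
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import ProjectedGradientDescent

# Load MNIST via the ART helper and flatten images for scikit-learn.
(x_train, y_train), (x_test, y_test), min_, max_ = load_mnist()
x_train = x_train.reshape(len(x_train), -1)[:5000]
x_test = x_test.reshape(len(x_test), -1)[:1000]
y_train, y_test = y_train[:5000], y_test[:1000]

# Fit the scikit-learn model and wrap it as an ART classifier.
model = LogisticRegression(max_iter=200).fit(x_train, y_train.argmax(axis=1))
classifier = SklearnClassifier(model=model, clip_values=(min_, max_))

# Run PGD and compare clean vs. adversarial accuracy.
attack = ProjectedGradientDescent(estimator=classifier, eps=0.2, eps_step=0.02, max_iter=40)
x_adv = attack.generate(x=x_test)
print("clean:", np.mean(classifier.predict(x_test).argmax(1) == y_test.argmax(1)))
print("adv:  ", np.mean(classifier.predict(x_adv).argmax(1) == y_test.argmax(1)))
```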

So far I can see a few TODOs remaining, and I'll continue working on them.

Please let me know what you think.

beat-buesser commented 5 years ago

I have created a new development branch development_sklearn_SVM and started to implement support for Support Vector Machines.

beat-buesser commented 5 years ago

I started working on a classifier for sklearn.svm.SVC. This is a short list of items that I'm implementing:

beat-buesser commented 5 years ago

The ART classifier for sklearn.svm.SVC now provides loss_gradient for linear and RBF kernels. Check out the PGD attack examples on the Iris and MNIST datasets in sklearn_svm_svc.ipynb.
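As a sanity check on what a decision-function gradient for an RBF-kernel SVC involves, here is a sketch for the binary case using only the fitted attributes (dual_coef_, support_vectors_, intercept_); it is purely illustrative and not the code on the branch:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.svm import SVC

# Binary toy problem (first two Iris classes), RBF kernel with explicit gamma.
X, y = load_iris(return_X_y=True)
X, y = X[y < 2].astype(float), y[y < 2]
gamma = 0.5
svc = SVC(kernel="rbf", gamma=gamma).fit(X, y)

def decision_gradient(model, x, gamma):
    """Gradient of the binary SVC decision function w.r.t. x:
    f(x) = sum_i a_i * exp(-gamma * ||sv_i - x||^2) + b, hence
    df/dx = sum_i a_i * (-2 * gamma) * (x - sv_i) * K(sv_i, x)."""
    sv = model.support_vectors_
    a = model.dual_coef_[0]                                # signed dual coefficients
    k = np.exp(-gamma * np.sum((sv - x) ** 2, axis=1))     # RBF kernel values
    return np.sum(a[:, None] * (-2.0 * gamma) * (x - sv) * k[:, None], axis=0)

# Check against a central finite difference on decision_function.
x0, eps = X[0], 1e-5
numeric = np.array([
    (svc.decision_function([x0 + eps * e]) - svc.decision_function([x0 - eps * e]))[0] / (2 * eps)
    for e in np.eye(x0.size)
])
assert np.allclose(decision_gradient(svc, x0, gamma), numeric, atol=1e-4)
```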

beat-buesser commented 5 years ago

I have created a new development branch development_decision_trees for the exploration and development of decision tree classifiers from sklearn.tree.DecisionTreeClassifier, sklearn.ensemble.{RandomForestClassifier, AdaBoostClassifier, GradientBoostingClassifier}, XGBoost, LightGBM, and CatBoost, and of related attacks and defenses.

This is an interesting, very recent article on adversarial examples and robustness for decision trees: https://arxiv.org/abs/1902.10660
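Since tree ensembles expose no input gradients, gradient-free attacks are the natural fit there. A rough sketch against the current ART API (ZooAttack only needs prediction queries; the parameter choices here are arbitrary and not taken from the branch):

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from art.estimators.classification import SklearnClassifier
from art.attacks.evasion import ZooAttack

# Fit a tree ensemble and wrap it; no gradients are available or needed.
X, y = load_iris(return_X_y=True)
model = RandomForestClassifier(n_estimators=100).fit(X, y)
classifier = SklearnClassifier(model=model, clip_values=(X.min(), X.max()))

# ZOO estimates gradients from prediction queries only (black-box).
attack = ZooAttack(classifier=classifier, max_iter=20, nb_parallel=1,
                   use_resize=False, use_importance=False)
x_adv = attack.generate(x=X[:10].astype(np.float32))
```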

beat-buesser commented 5 years ago

Progress with tree-based classifiers:

beat-buesser commented 5 years ago

All new classifiers and example notebooks have been merged into branch development_sklearn. Development will continue there to implement unit tests and improve the notebooks.