autoxgboost - Automatic tuning and fitting of xgboost

General overview

autoxgboost aims to find an optimal xgboost model automatically using the machine learning framework mlr and the Bayesian optimization framework mlrMBO.

Work in progress!

Benchmark

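| Name | Factors | Numerics | Classes | Train instances | Test instances |
| --- | --- | --- | --- | --- | --- |
| Dexter | 20 000 | 0 | 2 | 420 | 180 |
| GermanCredit | 13 | 7 | 2 | 700 | 300 |
| Dorothea | 100 000 | 0 | 2 | 805 | 345 |
| Yeast | 0 | 8 | 10 | 1 038 | 446 |
| Amazon | 10 000 | 0 | 49 | 1 050 | 450 |
| Secom | 0 | 591 | 2 | 1 096 | 471 |
| Semeion | 256 | 0 | 10 | 1 115 | 478 |
| Car | 6 | 0 | 4 | 1 209 | 519 |
| Madelon | 500 | 0 | 2 | 1 820 | 780 |
| KR-vs-KP | 37 | 0 | 2 | 2 237 | 959 |
| Abalone | 1 | 7 | 28 | 2 923 | 1 254 |
| Wine Quality | 0 | 11 | 11 | 3 425 | 1 469 |
| Waveform | 0 | 40 | 3 | 3 500 | 1 500 |
| Gisette | 5 000 | 0 | 2 | 4 900 | 2 100 |
| Convex | 0 | 784 | 2 | 8 000 | 50 000 |
| Rot. MNIST + BI | 0 | 784 | 10 | 12 000 | 50 000 |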

Datasets used for the comparison benchmark of autoxgboost, Auto-WEKA and auto-sklearn.

| Dataset | baseline | autoxgboost | Auto-WEKA | auto-sklearn |
| --- | --- | --- | --- | --- |
| Dexter | 52.78 | 12.22 | 7.22 | 5.56 |
| GermanCredit | 32.67 | 27.67 | 28.33 | 27.00 |
| Dorothea | 6.09 | 5.22 | 6.38 | 5.51 |
| Yeast | 68.99 | 38.88 | 40.45 | 40.67 |
| Amazon | 99.33 | 26.22 | 37.56 | 16.00 |
| Secom | 7.87 | 7.87 | 7.87 | 7.87 |
| Semeion | 92.45 | 8.38 | 5.03 | 5.24 |
| Car | 29.15 | 1.16 | 0.58 | 0.39 |
| Madelon | 50.26 | 16.54 | 21.15 | 12.44 |
| KR-vs-KP | 48.96 | 1.67 | 0.31 | 0.42 |
| Abalone | 84.04 | 73.75 | 73.02 | 73.50 |
| Wine Quality | 55.68 | 33.70 | 33.70 | 33.76 |
| Waveform | 68.80 | 15.40 | 14.40 | 14.93 |
| Gisette | 50.71 | 2.48 | 2.24 | 1.62 |
| Convex | 50.00 | 22.74 | 22.05 | 17.53 |
| Rot. MNIST + BI | 88.88 | 47.09 | 55.84 | 46.92 |

Benchmark results are median percent error across 100 000 bootstrap samples (out of 25 runs) simulating 4 parallel runs. Bold numbers indicate best performing algorithms.
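
The aggregation described above can be illustrated with a small sketch. It is not the code used to produce the benchmark numbers: the function name `bootstrap_median_error`, its parameters, and the example data are hypothetical. It also assumes, which the text above does not state, that each simulated group of 4 parallel runs reports the test error of the run with the lowest validation error.

```python
import random
import statistics


def bootstrap_median_error(runs, n_parallel=4, n_bootstrap=100_000, seed=0):
    """Median test error over bootstrap samples of simulated parallel runs.

    runs: list of (validation_error, test_error) pairs, one per independent run.
    Assumption: within each simulated group of n_parallel runs, the run with
    the lowest validation error is selected and its test error is reported.
    """
    rng = random.Random(seed)
    selected = []
    for _ in range(n_bootstrap):
        # Draw n_parallel runs with replacement from the available runs.
        group = rng.choices(runs, k=n_parallel)
        # Keep the test error of the run that looked best on validation data.
        best = min(group, key=lambda r: r[0])
        selected.append(best[1])
    return statistics.median(selected)


if __name__ == "__main__":
    # Hypothetical data: 25 runs with made-up validation/test errors in percent.
    data_rng = random.Random(42)
    runs = []
    for _ in range(25):
        val = data_rng.uniform(10.0, 20.0)
        test = val + data_rng.uniform(-1.0, 3.0)
        runs.append((val, test))
    print(bootstrap_median_error(runs, n_bootstrap=10_000))
```

Sampling with replacement from the 25 available runs approximates the result of having launched 4 runs in parallel, without the cost of actually running 4 times as many experiments.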

autoxgboost - How to Cite

The Automatic Gradient Boosting framework was presented at the ICML/IJCAI-ECAI 2018 AutoML Workshop (poster).
Please cite our ICML AutoML workshop paper on arXiv. You can get citation info via citation("autoxgboost") or copy the following BibTeX entry:

@inproceedings{autoxgboost,
  title={Automatic Gradient Boosting},
  author={Thomas, Janek and Coors, Stefan and Bischl, Bernd},
  booktitle={International Workshop on Automatic Machine Learning at ICML},
  year={2018}
}