hyperopt / hyperopt-sklearn

Hyper-parameter optimization for sklearn
hyperopt.github.io/hyperopt-sklearn
Other
1.59k stars 272 forks source link

add xgboost to hpsklearn.components #73

Closed ronnyli closed 7 years ago

ronnyli commented 7 years ago

Handy Tip: Sublime Text decided to make a bunch of whitespace changes to the original files. To view the diffs without whitespace changes, add w=1 as a parameter to the end of the diffs URL.

XGBoost is a very popular algorithm in Kaggle competitions as well as at my workplace. As a result I thought it might be helpful to add it to hyperopt-sklearn's collection of components.

This is my first attempt at contributing to an open source project so please don't hesitate to inform me of any steps I'm missing!

bjkomer commented 7 years ago

Thanks for putting together this pull request! Sorry I didn't see it until now, I often miss a lot of my GitHub email.

It looks great, the only minor thing I'm worried about is the extra dependency could potentially cause issues for some users. Luckily there is a way to specify optional dependencies, so I think that is the best route forward. The syntax for that in setup.py would be something like this:

install_requires = [
        'hyperopt',
        ...
]
extras_require = {
        'xgboost':  ['xgboost']
    }

We would also need to wrap the import in a try block or something so that the rest of the code cleanly works if the user doesn't have xgboost installed.

I'm hoping to put together a PyPI release at some point (it is long overdue...) and it would be nice to have your contribution in there as well.

ronnyli commented 7 years ago

Thanks for the feedback, @bjkomer! I'll make those changes ASAP.

ronnyli commented 7 years ago

@bjkomer xgboost should be an optional dependency now!

P.S. when running the tests it is highly recommended to set trial_timeout since xgboost can take a while to train, especially if n_estimators is high