hyperopt / hyperopt-sklearn

Hyper-parameter optimization for sklearn
hyperopt.github.io/hyperopt-sklearn

Add parameter n_jobs for multiprocessing to hyperopt_estimator #138

Closed. DavidBreuer closed this 3 years ago

DavidBreuer commented 5 years ago

Hi, this PR addresses the discussion in issue https://github.com/hyperopt/hyperopt-sklearn/issues/82 and especially comment https://github.com/hyperopt/hyperopt-sklearn/issues/82#issuecomment-430963445.

To support (at least some) multiprocessing, I added the well-known sklearn/joblib parameter n_jobs to hyperopt_estimator. Whenever an estimator that supports multiprocessing is instantiated, it is passed n_jobs as an argument:

estim = hpsklearn.HyperoptEstimator(..., n_jobs=2)
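As a rough illustration of "pass n_jobs only where it is supported", the sketch below inspects a constructor signature before forwarding the argument. The helper names supports_n_jobs and build_estimator and the two stand-in classes are hypothetical and not taken from this PR; the actual change may wire n_jobs through differently.

```python
import inspect


def supports_n_jobs(estimator_cls):
    """Return True if the class constructor accepts an n_jobs keyword."""
    return "n_jobs" in inspect.signature(estimator_cls.__init__).parameters


def build_estimator(estimator_cls, n_jobs=1, **params):
    """Instantiate estimator_cls, forwarding n_jobs only when it is accepted."""
    if supports_n_jobs(estimator_cls):
        params["n_jobs"] = n_jobs
    return estimator_cls(**params)


# Hypothetical stand-ins for sklearn estimators (assumption, not real classes).
class ParallelStandIn:
    def __init__(self, n_estimators=10, n_jobs=None):
        self.n_estimators = n_estimators
        self.n_jobs = n_jobs


class SerialStandIn:
    def __init__(self, C=1.0):
        self.C = C


parallel = build_estimator(ParallelStandIn, n_jobs=2, n_estimators=50)
serial = build_estimator(SerialStandIn, n_jobs=2, C=0.5)
```

Estimators without an n_jobs parameter are built unchanged, so a single n_jobs setting on the outer estimator does not break them.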

Two caveats: First, for smaller data sets, parallelization overhead can actually slow some sklearn functions down (cf. e.g. https://github.com/scikit-learn/scikit-learn/issues/6645 or https://github.com/scikit-learn/scikit-learn/issues/8216). However, a quick analysis showed that for larger datasets, using multiple cores may bring some benefit:

Figure_1

Legend: Color encodes the number of samples for a dummy sklearn.datasets.make_classification task. The y-axis shows the time needed to perform 30 evals relative to the time with one core. Random seeds were set to ensure that the hyperopt runs were identical. Repetitions would be needed for reliable time estimates, but this should be enough for demonstration purposes.

Second, it would be nice to parallelize the cross-validation part in _cost_fn but there is an interesting for-else statement that I could not handle using joblib... :) Suggestions are welcome.
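For readers unfamiliar with the construct: in a Python for-else, the else branch runs only if the loop finishes without hitting break. The sketch below is a simplified, hypothetical model of such a cross-validation loop (it is not the actual _cost_fn code) next to one possible restructuring that evaluates all folds first and checks for failures afterwards. The names run_fold, serial_cv and batched_cv are assumptions made for illustration.

```python
def run_fold(fold):
    """Hypothetical stand-in for fitting and scoring one fold; None means failure."""
    return fold.get("score")


def serial_cv(folds):
    """Serial loop with for-else: stop at the first failing fold."""
    scores = []
    for fold in folds:
        score = run_fold(fold)
        if score is None:
            result = {"status": "fail"}
            break  # skips the else branch below
        scores.append(score)
    else:
        # Reached only when no fold triggered break.
        result = {"status": "ok", "loss": 1 - sum(scores) / len(scores)}
    return result


def batched_cv(folds):
    """Evaluate every fold first, then decide; no early exit."""
    # This comprehension is the part a joblib.Parallel call could replace.
    scores = [run_fold(fold) for fold in folds]
    if any(score is None for score in scores):
        return {"status": "fail"}
    return {"status": "ok", "loss": 1 - sum(scores) / len(scores)}


good_folds = [{"score": 0.8}, {"score": 0.6}]
bad_folds = [{"score": 0.8}, {}, {"score": 0.6}]
```

The trade-off is that the batched variant gives up the early exit: every fold is evaluated even if the first one fails, which is the price of making the folds independent enough to run in parallel.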

I'm happy to hear ideas on how to improve this PR. If it should not be merged, that's perfectly fine, too. Thanks and best!

linminhtoo commented 3 years ago

I support this PR. May I ask why it hasn't been merged?

DavidBreuer commented 3 years ago

Thanks for the support. I rebased my branch onto the current upstream master branch to re-enable merging. If further adjustments are needed, please let me know.

bjkomer commented 3 years ago

Thanks for the PR! I missed this earlier, but I will take a look this weekend and test it out.

bjkomer commented 3 years ago

Looks good to me!