Open vecorro opened 3 years ago
Thanks for the minimal example; I can reproduce.
I'm not sure Dask-ML's XGBRegressor and XGBoost's XGBRegressor should work with HyperbandSearchCV; they don't have the partial_fit function, a requirement for using incremental hyperparameter optimization.
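A quick way to check this up front is to look for a callable partial_fit on the estimator before handing it to an incremental search. Here's a minimal sketch; supports_partial_fit and the two toy classes are hypothetical names made up for illustration, not part of Dask-ML or XGBoost:

```python
def supports_partial_fit(estimator):
    """Return True if the estimator exposes a callable partial_fit method."""
    return callable(getattr(estimator, "partial_fit", None))


class BatchOnlyModel:
    """Hypothetical stand-in for an estimator that only offers fit()."""

    def fit(self, X, y):
        return self


class IncrementalModel:
    """Hypothetical stand-in for an estimator that can train chunk by chunk."""

    def __init__(self):
        self.n_seen = 0

    def partial_fit(self, X, y):
        # Count the rows seen so far; a real model would update its weights.
        self.n_seen += len(X)
        return self


batch_ok = supports_partial_fit(BatchOnlyModel())
incremental_ok = supports_partial_fit(IncrementalModel())
print(batch_ok, incremental_ok)
```

The same check can be pointed at a real estimator instance to see whether it is a candidate for incremental search at all.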
It might be possible to jerry-rig XGBoost into supporting partial_fit; good references are https://github.com/dmlc/xgboost/issues/3055 and this SO question.
Thanks,
I think that Dask and XGBoost make a great couple. At the moment the combination seems better suited to situations where the data fits in memory. I'm starting to use HyperbandSearchCV and it's very useful (fast and performant). I think that getting this integration would draw more people to adopt Dask for tabular data problems where the dataset is bigger than the system memory.
Should I close the issue based on the fact that the functionality I was expecting to use actually is not implemented?
I don't think so; I've filed #840 to make the error more clear.
What happened: Got an error from XGBoost: "XGBoostError: need to call fit or load_model beforehand"
What you expected to happen: I want to use dask_ml.xgboost.XGBRegressor with dask_ml.model_selection.HyperbandSearchCV.
Minimal Complete Verifiable Example:
Anything else we need to know?: Here is the full error message:
/opt/conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py:285: UserWarning: The total space of parameters 8 is smaller than n_iter=81. Running 8 iterations. For exhaustive searches, use GridSearchCV.
  warnings.warn(
/opt/conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py:285: UserWarning: The total space of parameters 8 is smaller than n_iter=34. Running 8 iterations. For exhaustive searches, use GridSearchCV.
  warnings.warn(
/opt/conda/lib/python3.8/site-packages/sklearn/model_selection/_search.py:285: UserWarning: The total space of parameters 8 is smaller than n_iter=15. Running 8 iterations. For exhaustive searches, use GridSearchCV.
  warnings.warn(
XGBoostError Traceback (most recent call last)