alexk101 opened 1 year ago
Hi @alexk101 , are you using FLAML alongside an existing ML framework?
Yes. Currently we are using LightGBM for our models. The way I have been working around this is by doing the hyperparameter optimization outside of DVCLive and then retraining a LightGBM model (which is supported) with the best values, as shown below.
from flaml import AutoML
from dvclive import Live
from dvclive.lgbm import DVCLiveCallback
import lightgbm as lgb

automl = AutoML(**auto_ml_opts)
automl.fit(
    X_train=X_train.to_numpy(),
    y_train=y_train.to_numpy(),
    X_val=X_test.to_numpy(),
    y_val=y_test.to_numpy(),
    time_budget=TIME,
    estimator_list=['lgbm'],
    task=model_type,
)

starting_points = automl.best_config_per_estimator['lgbm']

with Live(str(output / 'dvclive')) as live:
    gbm = lgb.LGBMRegressor(
        **starting_points,
        gpu_platform_id=1,
        gpu_device_id=0,
    )
    gbm.fit(
        X_train.to_numpy(),
        y_train.to_numpy(),
        eval_set=[(X_test.to_numpy(), y_test.to_numpy())],
        eval_metric="rmse",
        callbacks=[
            lgb.early_stopping(20),
            DVCLiveCallback(live=live, save_dvc_exp=True),
        ],
    )
This is not ideal, since it doesn't capture the hyperparameter optimization, but it is minimally sufficient for what we are doing at the moment. I saw that Optuna was supported, so I thought it might be reasonable to request FLAML as another option.
That makes sense, I will see what kind of integration could be done.
I was asking because, even for Optuna, I personally find it more convenient to either use DVCLive manually or use the ML framework integration directly.
For example, in FLAML it looks like you can pass a callback to fit, like:
from flaml import AutoML
from dvclive import Live
from dvclive.lgbm import DVCLiveCallback
import lightgbm as lgb

automl = AutoML(**auto_ml_opts)
automl.fit(
    X_train=X_train.to_numpy(),
    y_train=y_train.to_numpy(),
    X_val=X_test.to_numpy(),
    y_val=y_test.to_numpy(),
    time_budget=TIME,
    estimator_list=['lgbm'],
    task=model_type,
    callbacks=[DVCLiveCallback(save_dvc_exp=True)],
)
And every iteration of the hyperparameter optimization would create a DVC experiment.
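For context, the idea boils down to writing one metrics step per HPO iteration. A minimal sketch of that shape, using a hypothetical FakeLive stand-in for dvclive.Live (log_trial, the configs, and the loss values are also made up for illustration, not FLAML or DVCLive APIs):

```python
class FakeLive:
    """Hypothetical stand-in mimicking dvclive.Live's log_metric/next_step."""

    def __init__(self):
        self.steps = []          # one dict of metrics per completed step
        self._current = {}

    def log_metric(self, name, value):
        self._current[name] = value

    def next_step(self):
        # Finalize the current step and start a new one.
        self.steps.append(dict(self._current))
        self._current = {}


def log_trial(live, trial_number, config, loss):
    # Made-up helper: record one hyperparameter-optimization trial
    # as a single logging step.
    live.log_metric("trial", trial_number)
    for key, value in config.items():
        live.log_metric(f"config/{key}", value)
    live.log_metric("val_loss", loss)
    live.next_step()


live = FakeLive()
trials = [({"n_estimators": 100}, 0.42), ({"n_estimators": 200}, 0.37)]
for i, (config, loss) in enumerate(trials):
    log_trial(live, i, config, loss)
```

With a real Live object each step could instead become a DVC experiment, which is what the callback approach above would automate.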
It also looks like you could customize it further by using a custom class like in https://microsoft.github.io/FLAML/docs/Examples/AutoML-for-LightGBM/#create-a-customized-lightgbm-learner-with-a-custom-objective-function
Awesome. I will look into those custom callbacks further. Thanks @daavoo!
It would be great to document this in https://dvc.org/doc/dvclive/ml-frameworks.
I would love to see integration of Microsoft's FLAML hyperparameter optimizer!