microsoft / FLAML

A fast library for AutoML and tuning. Join our Discord: https://discord.gg/Cppx2vSPVP.
https://microsoft.github.io/FLAML/
MIT License

custom_metric() function error #1277

Open ct-rparrondo opened 4 months ago

ct-rparrondo commented 4 months ago

Hi! I defined my custom_metric function following your documentation:

`def custom_metric(X_val, y_val, estimator, X_train, y_train, weight_val=None, weight_train=None):
    start = time.time()
    y_val_pred = estimator.predict_proba(X_val)
    pred_time = (time.time() - start) / len(X_val)
    val_mcc = metrics.matthews_corrcoef(y_val, y_val_pred, sample_weight=weight_val)

    y_train_pred = estimator.predict_proba(X_train)
    train_mcc = metrics.matthews_corrcoef(y_train, y_train_pred, sample_weight=weight_train)

    return val_mcc, {"val_mcc": val_mcc, "train_mcc": train_mcc, "pred_time": pred_time}`

And I call the function as follows:

`automl = AutoML()
automl.fit(X_train=X_train, y_train=y_train_select,
           custom_metric=custom_metric(X_val=X_val, y_val=y_val_select, estimator=automl, X_train=X_train, y_train=y_train_select),
           task="classification", verbose=3, n_jobs=-1, time_budget=60)`

However, the following error arises:

`ValueError                                Traceback (most recent call last)
Cell In[13], line 101
     98 # Create an AutoML class for tuning the hyperparameters and select the best model
     99 automl = AutoML()
    100 automl.fit(X_train=X_train, y_train=y_train_select,
--> 101            custom_metric=custom_metric(X_val=X_val, y_val=y_val_select, estimator=automl, X_train=X_train, y_train=y_train_select),
    102            task="classification", verbose=3, n_jobs=-1,
    103            time_budget=60, log_file_name=f'{safety_endpoint[3:]}.log', seed=42)
    105 logging.info('Best estimator:', automl.best_estimator)
    106 logging.info('Best hyperparmeter config:', automl.best_config)

Cell In[6], line 6
      4     y_val_pred = estimator.predict_proba(X_val)
      5     pred_time = (time.time() - start) / len(X_val)
----> 6     val_mcc = metrics.matthews_corrcoef(y_val, y_val_pred, sample_weight=weight_val)
      8     y_train_pred = estimator.predict_proba(X_train)
      9     train_mcc = metrics.matthews_corrcoef(y_train, y_train_pred, sample_weight=weight_train)

File /opt/conda/envs/raquel_tfm/lib/python3.11/site-packages/sklearn/metrics/_classification.py:911, in matthews_corrcoef(y_true, y_pred, sample_weight)
    848 def matthews_corrcoef(y_true, y_pred, *, sample_weight=None):
    849     """Compute the Matthews correlation coefficient (MCC).
    850
    851     The Matthews correlation coefficient is used in machine learning as a
   (...)
    909     -0.33...
    910     """
--> 911     y_type, y_true, y_pred = _check_targets(y_true, y_pred)
    912     check_consistent_length(y_true, y_pred, sample_weight)
    913     if y_type not in {"binary", "multiclass"}:

File /opt/conda/envs/raquel_tfm/lib/python3.11/site-packages/sklearn/metrics/_classification.py:88, in _check_targets(y_true, y_pred)
     86 check_consistent_length(y_true, y_pred)
     87 type_true = type_of_target(y_true, input_name="y_true")
---> 88 type_pred = type_of_target(y_pred, input_name="y_pred")
     90 y_type = {type_true, type_pred}
     91 if y_type == {"binary", "multiclass"}:

File /opt/conda/envs/raquel_tfm/lib/python3.11/site-packages/sklearn/utils/multiclass.py:301, in type_of_target(y, input_name)
    294 valid = (
    295     (isinstance(y, Sequence) or issparse(y) or hasattr(y, "__array__"))
    296     and not isinstance(y, str)
    297     or is_array_api
    298 )
    300 if not valid:
--> 301     raise ValueError(
    302         "Expected array-like (array or non-string sequence), got %r" % y
    303     )
    305 sparse_pandas = y.__class__.__name__ in ["SparseSeries", "SparseArray"]
    306 if sparse_pandas:

ValueError: Expected array-like (array or non-string sequence), got None`

How can I fix it? Thank you for your time!!

P.S. Some of the Jupyter notebooks in the tutorials folder of the repo don't work and raise the same error (mentioned in issue #1217).

Programmer-RD-AI commented 1 month ago

According to #1217, which you referred to, that issue was fixed in FLAML v2.1.1. Please check whether upgrading fixes this issue as well. Thank you.
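
Separately from the version question, the traceback suggests a likely cause in the call itself. `custom_metric(...)` is invoked inside the `fit()` call, so it runs once, before any training, with the unfitted `AutoML` object as `estimator`. The `got None` in the error suggests that `predict_proba` returned nothing at that point. As far as I understand the FLAML documentation, the function object itself should be passed through the `metric` argument, the signature should include a `labels` parameter after `estimator`, and the first return value is minimized. A minimal sketch under those assumptions follows. `make_custom_metric` and `score_fn` are hypothetical names for this illustration, not part of FLAML, and it uses `predict()` because MCC expects class labels rather than probabilities.

```python
import time


def make_custom_metric(score_fn):
    """Wrap a "higher is better" score as a FLAML-style custom metric.

    score_fn is any callable score_fn(y_true, y_pred, sample_weight=None),
    for example sklearn.metrics.matthews_corrcoef. (Hypothetical helper.)
    """

    def custom_metric(
        X_val, y_val, estimator, labels,
        X_train, y_train,
        weight_val=None, weight_train=None,
        *args,  # tolerate any extra positional arguments from the caller
    ):
        # Scores such as MCC need hard class labels: predict(), not predict_proba().
        start = time.time()
        y_val_pred = estimator.predict(X_val)
        pred_time = (time.time() - start) / len(X_val)
        val_score = score_fn(y_val, y_val_pred, sample_weight=weight_val)

        y_train_pred = estimator.predict(X_train)
        train_score = score_fn(y_train, y_train_pred, sample_weight=weight_train)

        # Assumption: the first return value is minimized, so return a loss.
        return 1 - val_score, {
            "val_score": val_score,
            "train_score": train_score,
            "pred_time": pred_time,
        }

    return custom_metric


# Intended use (requires flaml and scikit-learn; not executed here):
#
#   from flaml import AutoML
#   from sklearn.metrics import matthews_corrcoef
#
#   automl = AutoML()
#   automl.fit(
#       X_train=X_train, y_train=y_train_select,
#       metric=make_custom_metric(matthews_corrcoef),  # a function object
#       task="classification", time_budget=60,
#   )
```

With this shape, the metric is evaluated by the search on each trained candidate rather than once up front, and the validation data comes from the split made during `fit()` instead of being bound in advance.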