What should label= be for a regression model?

Roffild commented 3 years ago

import numpy
mylbl = numpy.array([[0.], [0.], [1.], [0.], [1.]], dtype=numpy.float32)
assert mylbl.ndim == 2  # 2-D column vector of shape (5, 1)

xgboost.DMatrix(label=)

label : list, numpy 1-D array or cudf.DataFrame, optional. Label of the training data.
label=None - xgboost.core.XGBoostError: src/objective/regression_obj.cu:57: Check failed: preds.Size() == info.labels_.Size() (61872 vs. 0) : labels are not correctly providedpreds.size=61872, label.size=0, Loss: binary:logistic
label=mylbl - OK

catboost.Pool(label=)

label : list or numpy.ndarrays or pandas.DataFrame or pandas.Series, optional (default=None). Label of the training data. If not None, giving 1 or 2 dimensional array like data with floats.
label=None - File "catboost\core.py", line 976, in _build_train_pool
    raise CatBoostError("Label in X has not been initialized.")
label=mylbl - OK

lightgbm.Dataset(label=)

label : list, numpy 1-D array, pandas Series / one-column DataFrame or None, optional (default=None). Label of the data.
label=None - OK
label=mylbl - File "lightgbm\basic.py", line 93, in list_to_1d_numpy
    raise TypeError("Wrong type({0}) for {1}.\n"
TypeError: Wrong type(ndarray) for label. It should be list, numpy 1-D array or pandas Series

Could the three libraries agree on a single standard for label=?

Parameter description is incorrect.
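
Until the three libraries agree, one way to sidestep the difference is to normalize the label shape before building each dataset. The following is a minimal sketch that assumes only NumPy; `to_1d_label` is a hypothetical helper and not part of xgboost, catboost, or lightgbm.

```python
import numpy as np


def to_1d_label(label):
    """Return the label as a 1-D float32 array of shape (n,).

    Accepts shape (n,) or a single column of shape (n, 1).
    Anything else is rejected rather than silently flattened.
    """
    arr = np.asarray(label, dtype=np.float32)
    if arr.ndim == 1:
        return arr
    if arr.ndim == 2 and arr.shape[1] == 1:
        # Collapse the single column into a flat vector.
        return arr.ravel()
    raise ValueError(f"expected shape (n,) or (n, 1), got {arr.shape}")


# The 2-D label from the report above.
mylbl = np.array([[0.], [0.], [1.], [0.], [1.]], dtype=np.float32)
flat = to_1d_label(mylbl)
print(flat.ndim, flat.shape)
```

The flattened array could then be passed as label= to each constructor, since all three parameter descriptions quoted above list 1-D input as accepted.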

Roffild commented 3 years ago

https://github.com/dmlc/xgboost/issues/6786 https://github.com/catboost/catboost/issues/1623 https://github.com/microsoft/LightGBM/issues/4115

Roffild commented 3 years ago

Used training.

Roffild commented 3 years ago

I am iterating over the parameters, so it is easier for me to always set label. An algorithm-level error is preferable to a global error.

The description says 1-D array, but for regression the label must be N-D.

andrey-khropov commented 3 months ago

> Parameter description is incorrect.

What is incorrect about it (for CatBoost)?

> label : list or numpy.ndarrays or pandas.DataFrame or pandas.Series, optional (default=None) Label of the training data. If not None, giving 1 or 2 dimensional array like data with floats.

One- and two-dimensional arrays are both accepted, as the description specifies.

> label=None File "catboost\core.py", line 976, in _build_train_pool raise CatBoostError("Label in X has not been initialized.")

This error only happens if you use the Pool for training in a mode where label data is required. If you use it for training with pairs (where label data is optional) or for prediction, label=None is perfectly valid and no error occurs.
