ray-project / xgboost_ray

Distributed XGBoost on Ray
Apache License 2.0

add support for multi-output prediction #286

Open yc2984 opened 11 months ago

yc2984 commented 11 months ago

Currently, xgboost_ray doesn't support multi-output prediction. Neither of the following two options works:

  1. Providing a list of label columns: RayDMatrix(path, label=label_cols, filetype=RayFileType.PARQUET)
  2. Providing a concrete data frame with multiple label columns: RayDMatrix(data=df[feature_columns], label=df[target_columns])

However, the second option is supported by the original xgboost package. Future development of multi-output support is tracked in this issue: https://github.com/dmlc/xgboost/issues/9043. A few distributed options are mentioned there, but not Ray. Is there a plan to develop this feature for xgboost_ray soon as well? cc @Yard1

I also asked this on the Ray discussion forum: https://discuss.ray.io/t/does-xgboost-ray-supports-multi-output-many-y-labels/11383

Yard1 commented 11 months ago

cc @krfricke