kawaho opened this issue 2 years ago
Hi authors of hep_ml,

I am wondering whether there is an easy way to use a loss function from this package (in particular BinFlatnessLossFunction) in XGBoost, since XGBoost supports custom loss functions in the usual grad/hess format (https://xgboost.readthedocs.io/en/stable/tutorials/custom_metric_obj.html). This could help speed up training, since hep_ml does not support multithreading (please correct me if I am wrong).

Thanks,
Andy
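For reference, the grad/hess contract described in the tutorial linked above looks like this: a custom objective receives the raw predictions and the `DMatrix`, and returns the per-event first and second derivatives of the loss. A minimal sketch with a plain binary logistic objective (the data is synthetic, only to make the snippet runnable):

```python
import numpy as np
import xgboost as xgb

# Synthetic data, just to make the snippet self-contained.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 5))
y = (X[:, 0] + 0.5 * rng.normal(size=1000) > 0).astype(float)
dtrain = xgb.DMatrix(X, label=y)

def logistic_obj(preds, dtrain):
    """Binary logistic loss in the (grad, hess) form xgboost expects."""
    labels = dtrain.get_label()
    p = 1.0 / (1.0 + np.exp(-preds))  # raw margin -> probability
    grad = p - labels                 # first derivative of logloss w.r.t. the margin
    hess = p * (1.0 - p)              # second derivative
    return grad, hess

booster = xgb.train({'max_depth': 3, 'eta': 0.1}, dtrain,
                    num_boost_round=50, obj=logistic_obj)
```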
It should be possible, just try and see.

hep_ml uses a more general loss format, see here: https://github.com/arogozhnikov/hep_ml/blob/master/hep_ml/losses.py#L88-L138. To use it within xgboost you need init, fit, and prepare_tree_params.

The difference from other methods is the loss's ability to remember additional characteristics of the observations (such as control variables). Most loss functions I am aware of ignore the possibility of such factors: they assume that the loss for each observation does not depend on the others.

So, depending on the implementation in xgboost (i.e. whether it preserves the order of observations on each call), you can init and fit the loss outside of xgboost, then wrap prepare_tree_params and pass it to xgboost as the custom loss.

That said, I'd start by checking that you are really bottlenecked by tree building and not by loss computation (the flatness computation is rather resource-consuming). If the loss computation dominates, you'll see no benefit from moving to xgboost.
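A minimal sketch of that wiring, under a few stated assumptions: hep_ml's AbstractLossFunction contract is fit(X, y, sample_weight) followed by prepare_tree_params(y_pred), which returns the negative gradient and per-event tree weights; xgboost is assumed to call the objective with predictions for all training events in their original order. The column name 'mass', the toy data, and the hyperparameters are placeholders, and the tree weights are used here as a crude stand-in for the hessian, which is an approximation rather than hep_ml's own usage.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
from hep_ml.losses import BinFlatnessLossFunction

# Toy data standing in for a real analysis: 'mass' is the control variable
# in which we want the classifier output to be flat (for label 0).
rng = np.random.default_rng(0)
n = 5000
X = pd.DataFrame({
    'mass': rng.uniform(0.0, 1.0, size=n),
    'feat1': rng.normal(size=n),
    'feat2': rng.normal(size=n),
})
y = (X['feat1'] + rng.normal(size=n) > 0).astype(int).values

# 1) init & fit the loss outside of xgboost, on the full training sample.
loss = BinFlatnessLossFunction(uniform_features=['mass'], uniform_label=0)
loss.fit(X, y, sample_weight=np.ones(n))

# 2) wrap prepare_tree_params as an xgboost custom objective.
def flatness_obj(preds, dtrain):
    # Valid only if `preds` covers the same events, in the same order,
    # that were passed to loss.fit above.
    neg_grad, weights = loss.prepare_tree_params(preds)
    grad = -neg_grad   # xgboost expects d(loss)/d(pred); hep_ml returns the negative gradient
    hess = weights     # hep_ml returns tree weights (ones by default), a crude hessian stand-in
    return grad, hess

# 3) pass it to xgboost; 'mass' is excluded from the training features.
dtrain = xgb.DMatrix(X[['feat1', 'feat2']], label=y)
booster = xgb.train({'max_depth': 4, 'eta': 0.1}, dtrain,
                    num_boost_round=100, obj=flatness_obj)
```

The order-preservation caveat is the critical one: if xgboost were to reorder or subsample the prediction vector passed to the objective, the wrapper would silently compute gradients for the wrong events, so this should be verified on your xgboost version before trusting the results.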