kiran-vj opened 2 months ago
Out of curiosity, how do you tune such models? I imagine the HPO search space would be extremely large?
You're right, the HPO search space would be large. But we can approach it as a two-step process. In the first step, the HPO search would be limited to a single uniform min_child_weight (MCW) shared by all features. In the second step, we would tune the MCW parameter only for the problematic features identified by the modellers, keeping the uniform value for everything else.
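The two-step process above can be sketched as follows. This is a toy illustration only: `evaluate` is a stand-in for a real cross-validated XGBoost training run, and a per-feature MCW list is assumed as input (no such parameter exists in XGBoost today).

```python
# Toy stand-in for a cross-validated training run. In practice this would
# train an XGBoost model with the given per-feature MCW values and return
# a validation metric (higher is better here).
def evaluate(mcw_per_feature):
    # Hypothetical score surface: pretend feature 2 is sparse and benefits
    # from a low MCW, while the other features prefer a higher one.
    target = [5.0, 5.0, 0.5]
    return -sum((m - t) ** 2 for m, t in zip(mcw_per_feature, target))

n_features = 3
grid = [0.5, 1.0, 5.0]

# Step 1: search a single uniform MCW applied to all features.
best_uniform = max(grid, key=lambda m: evaluate([m] * n_features))

# Step 2: keep the uniform value, and re-tune MCW only for the features
# flagged as problematic by the modellers (here, feature 2).
problematic = [2]
best = [best_uniform] * n_features
for f in problematic:
    scored = []
    for m in grid:
        trial = list(best)
        trial[f] = m
        scored.append((evaluate(trial), m))
    best[f] = max(scored)[1]

print(best)  # the uniform optimum everywhere, a relaxed MCW on feature 2
```

The point of the split is that step 1 is the same one-dimensional search modellers run today, and step 2 only adds one extra dimension per flagged feature, so the combined search space stays tractable.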
@trivialfis We wish to contribute this feature by helping in the development. What would the process look like for getting this approved and merged into XGBoost?
Any thoughts on this?
Apologies for the slow reply. It's not a trivial change. You can find the parameter definition here: https://github.com/dmlc/xgboost/blob/cb54374550002efa7e4f2279c8941b4c7c196188/src/tree/param.h#L25

If you turn it into a vector, it can be parsed as JSON, similar to https://github.com/dmlc/xgboost/blob/cb54374550002efa7e4f2279c8941b4c7c196188/src/tree/param.cc#L85

By searching for the parameter name, you can find where it's used to prevent splits. The split candidate carries the split feature index. I'm not entirely sure about the GPU implementation yet.
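To make the split-gating idea concrete, here is a language-agnostic sketch in Python (the actual change would live in XGBoost's C++ tree code; `SplitCandidate` and the hessian-sum fields here are illustrative names, not the real internals). The key point from the comment above is that the split candidate already knows its feature index, so a vector-valued parameter can be indexed per candidate:

```python
from dataclasses import dataclass

@dataclass
class SplitCandidate:
    feature_index: int     # which feature this candidate splits on
    left_hess_sum: float   # sum of hessians in the proposed left child
    right_hess_sum: float  # sum of hessians in the proposed right child

def split_allowed(cand, min_child_weight):
    """min_child_weight is either a scalar (current behaviour) or a
    per-feature sequence (the proposed extension)."""
    if isinstance(min_child_weight, (int, float)):
        mcw = min_child_weight                       # uniform threshold
    else:
        mcw = min_child_weight[cand.feature_index]   # per-feature lookup
    # A split is rejected if either child falls below the threshold.
    return cand.left_hess_sum >= mcw and cand.right_hess_sum >= mcw

# A sparse feature produces a light left child: a uniform MCW of 1.0
# rejects the split, while a per-feature vector can relax the threshold
# for just that feature (index 2 here).
sparse = SplitCandidate(feature_index=2, left_hess_sum=0.4, right_hess_sum=3.0)
print(split_allowed(sparse, 1.0))               # False
print(split_allowed(sparse, [1.0, 1.0, 0.25]))  # True
```

The scalar branch preserves backwards compatibility, which is presumably what keeping the JSON-parsed parameter optional would buy in the real implementation.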
In practical modelling scenarios there are often key variables that are very sparsely populated. This forces modelers to set a low min_child_weight to ensure these variables are incorporated into the model, but that can lead to overfitting on other variables.
To avoid such scenarios, we propose adding the flexibility to set a different min_child_weight value for each feature.
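As a hypothetical example of what the proposed interface could look like (the actual parameter shape would be decided during implementation; this simply assumes a JSON-style list with one value per feature index):

```python
# Hypothetical: min_child_weight as a per-feature list rather than a scalar.
# Feature 7 is the sparse-but-key variable, so it gets a relaxed threshold;
# the remaining features keep a higher value to guard against overfitting.
params = {
    "max_depth": 6,
    "eta": 0.1,
    "min_child_weight": [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.1],
}

# The existing scalar form would remain valid for backwards compatibility:
legacy_params = {"max_depth": 6, "eta": 0.1, "min_child_weight": 1.0}
```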