Open longdahl opened 2 years ago
The current implementation is how you would naturally define interaction constraints: each leaf node will see only variables from one variable group that is allowed to interact. This generates a boosted trees model that respects the interaction constraints. The original XGBoost implementation did not stick to this rule. Following this issue https://github.com/dmlc/xgboost/issues/7115, XGBoost started to forbid overlapping feature sets in the constraints (as this is the only situation where their original implementation was strange).
Thanks for your reply. I thought the original XGBoost definition of interaction constraints might also have had some very interesting applications, but if it's in conflict with the definition of interaction constraints, it would of course be a whole different feature altogether.
Description
Bug report
interaction constraints* limit not only the immediately subsequent split, but all further splits in the tree
Short preamble: This assumes that the behavior is intended to mirror the behavior of ‘feature interaction constraints’ in xgboost. I assume this is the case because the original feature request refers to the xgboost version: https://github.com/microsoft/LightGBM/issues/2884. If the lightgbm implementation is not intended to follow the same logic, I should probably submit this as a feature request instead.
The error in detail: When you specify an interaction constraint in LightGBM 3.3.2, it prevents not only the immediately subsequent split, but all future splits in the same tree. Specifically, given the constraints [[0,1],[1,2]], a tree that starts with a split on feature 0 cannot use feature 2 in ANY further split in the tree. Of course, a split on feature ‘2’ directly after feature ‘0’ should not be allowed, but it should allow for a split on ‘0’ followed by a split on ‘1’ and then finally a split on feature ‘2’. Again, this assumes that the implementation follows the xgboost documentation. See the final graph in the xgboost documentation, which highlights that such pathways are legal: https://xgboost.readthedocs.io/en/stable/tutorials/feature_interaction_constraint.html
*https://lightgbm.readthedocs.io/en/latest/Parameters.html#interaction_constraints
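The two readings discussed in this thread can be contrasted with a small sketch. This is plain Python, not code from either library, and the function names `allowed_under_branch_rule` and `allowed_under_parent_rule` are hypothetical. The first encodes the rule described in the reply (every feature on a root-to-leaf path must come from a single group); the second encodes the reading in this report (only a split and its immediate parent need to share a group).

```python
def allowed_under_branch_rule(path, constraints):
    """True if one single constraint group contains every feature on the path."""
    return any(set(path) <= set(group) for group in constraints)


def allowed_under_parent_rule(path, constraints):
    """True if each (parent, child) pair of consecutive splits shares a group."""
    return all(
        any(parent in group and child in group for group in constraints)
        for parent, child in zip(path, path[1:])
    )


constraints = [[0, 1], [1, 2]]
path = [0, 1, 2]  # split on 0, then on 1, then on 2

print(allowed_under_branch_rule(path, constraints))  # False
print(allowed_under_parent_rule(path, constraints))  # True
```

Both rules reject a split on feature 2 directly below a split on feature 0; they differ only once feature 1 sits between them.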
Reproducible example
Creating a dummy dataset for the test
Training the model
Dumping model object and iterating over each tree
Output of above code block:
In summary, interaction constraints limit not only the immediately subsequent split, but all further splits in the tree.
While the above is of course anecdotal, it would be highly improbable for no such tree to appear if these split paths were permitted. I have also observed the same issue on other datasets.
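A check of this kind over every tree could look like the sketch below. It is not the code originally posted. It assumes the nested dictionary layout returned by `Booster.dump_model()` (a `tree_info` list whose entries hold a `tree_structure` with `split_feature`, `left_child` and `right_child` keys), and the helper names and the hand-built `toy_dump` are hypothetical stand-ins so the block runs without a trained model.

```python
def root_to_leaf_feature_paths(node, prefix=()):
    """Yield the tuple of split features along each root-to-leaf path."""
    if "split_feature" not in node:  # leaf nodes carry no split
        yield prefix
        return
    path = prefix + (node["split_feature"],)
    yield from root_to_leaf_feature_paths(node["left_child"], path)
    yield from root_to_leaf_feature_paths(node["right_child"], path)


def trees_using_both(model_dump, first, later):
    """Indices of trees with a path that splits on `first` and later on `later`."""
    hits = []
    for index, tree in enumerate(model_dump["tree_info"]):
        for path in root_to_leaf_feature_paths(tree["tree_structure"]):
            if first in path and later in path[path.index(first) + 1:]:
                hits.append(index)
                break
    return hits


# Hand-built stand-in for a dumped model (assumed layout, made-up values).
leaf = {"leaf_value": 0.0}
toy_dump = {
    "tree_info": [
        # Tree 0: split on 0, then on 1.
        {"tree_structure": {
            "split_feature": 0,
            "left_child": {"split_feature": 1,
                           "left_child": leaf, "right_child": leaf},
            "right_child": leaf}},
        # Tree 1: split on 0, then on 1, then on 2.
        {"tree_structure": {
            "split_feature": 0,
            "left_child": {"split_feature": 1,
                           "left_child": {"split_feature": 2,
                                          "left_child": leaf,
                                          "right_child": leaf},
                           "right_child": leaf},
            "right_child": leaf}},
    ]
}

print(trees_using_both(toy_dump, first=0, later=2))  # [1]
```

With a real model, passing `booster.dump_model()` in place of `toy_dump` and getting an empty list back would match the behavior described in this report.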
Environment info
LightGBM 3.3.2