Feature Cycling as an option instead of Random Feature Sampling

Summary

Add an option that lets the model select features cyclically (round-robin, one feature per tree) instead of sampling them at random.

See here for the initial discussion on LightGBM and EBMs.

Motivation

Model explainability is becoming ever more important in the ML space. LightGBM can take advantage of some of the methods used by Explainable Boosting Machines (EBMs) to make models more interpretable. One feature of EBMs is that they build shallow, single-feature trees (currently possible in LightGBM by toying with parameters). In an EBM, however, these trees are boosted in a cyclic fashion, which LightGBM does not currently offer.

For example, for a model with 3 features:

Tree 1 - feature 1
Tree 2 - feature 2
Tree 3 - feature 3
Tree 4 - feature 1
Tree 5 - feature 2
Tree 6 - feature 3
...
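The schedule above can be sketched in a few lines of plain Python. This is only an illustration of the ordering, not LightGBM code: `cyclic_schedule` and `random_schedule` are hypothetical helper names, and the random variant assumes exactly one feature is drawn per tree.

```python
import random
from itertools import cycle, islice


def cyclic_schedule(n_features, n_trees):
    """Feature index for each tree when features are visited round-robin."""
    return list(islice(cycle(range(n_features)), n_trees))


def random_schedule(n_features, n_trees, seed=0):
    """Feature index for each tree when one feature is drawn at random per tree."""
    rng = random.Random(seed)
    return [rng.randrange(n_features) for _ in range(n_trees)]


# Zero-based indices: "feature 1" in the text is index 0 here.
print(cyclic_schedule(3, 6))  # [0, 1, 2, 0, 1, 2]
print(random_schedule(3, 6))  # depends on the seed; features may repeat or be skipped
```

With cycling, every feature is guaranteed one tree per pass. With random sampling, a feature can go unvisited for several trees in a row.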

Cycling through the features ensures that the model gains information from collinear features that may be equally important. By contrast, with a small number of deeper trees it is easy to end up biased toward the first few randomly selected features, which capture most of the gain.
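To make the point about collinear features concrete, here is a minimal pure-Python sketch of cyclic boosting with depth-1 trees (stumps) on squared error. It is a toy under stated assumptions, not how LightGBM or InterpretML implement boosting: `fit_stump` and `cyclic_boost` are hypothetical names, the data is made up, and the two features are exact duplicates of each other.

```python
def fit_stump(xs, residuals):
    """Fit a depth-1 tree on one feature; returns (threshold, left_value, right_value)."""
    mean = sum(residuals) / len(residuals)
    best = (float("inf"), xs[0], mean, mean)  # fallback for a constant feature
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        lv, rv = sum(left) / len(left), sum(right) / len(right)
        sse = sum((r - lv) ** 2 for r in left) + sum((r - rv) ** 2 for r in right)
        if sse < best[0]:
            best = (sse, threshold, lv, rv)
    return best[1:]


def cyclic_boost(X, y, n_rounds, learning_rate=0.5):
    """Boost single-feature stumps, visiting features in round-robin order."""
    n_features = len(X[0])
    pred = [0.0] * len(y)
    gain = [0.0] * n_features  # squared-error reduction credited to each feature
    for t in range(n_rounds):
        j = t % n_features  # the cyclic choice: tree t uses feature t mod n_features
        column = [row[j] for row in X]
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        threshold, lv, rv = fit_stump(column, residuals)
        before = sum(r ** 2 for r in residuals)
        pred = [p + learning_rate * (lv if x <= threshold else rv)
                for p, x in zip(pred, column)]
        after = sum((yi - pi) ** 2 for yi, pi in zip(y, pred))
        gain[j] += before - after
    return pred, gain


# Two perfectly collinear (identical) features.
X = [[0, 0], [0, 0], [1, 1], [1, 1]]
y = [0.0, 0.0, 1.0, 1.0]
pred, gain = cyclic_boost(X, y, n_rounds=4)
print(gain)  # both entries are positive: each feature receives some of the gain
```

In this toy the first feature visited still collects the larger share, because it sees the largest residuals. Lowering `learning_rate` evens out the split between the two features.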

Description

This option would make LightGBM models more interpretable and their results more directly comparable to EBMs. It would allow users to make an informed decision on the trade-off between interpretability and model performance.

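Additionally, in certain cases the model may be more robust at inference time when one of a set of collinear features is missing, because the signal is spread across the collinear features rather than concentrated in one of them.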

References

A great conceptual video explanation.

InterpretML: A Unified Framework for Machine Learning Interpretability
InterpretML: A toolkit for understanding machine learning models
InterpretML's Explainable Boosting Machine
