csinva / imodels

Interpretable ML package 🔍 for concise, transparent, and accurate predictive modeling (sklearn-compatible).
https://csinva.io/imodels
MIT License
1.35k stars 120 forks

RuleFitClassifier(tree_generator = GradientBoostingClassifier()) not working as per documentation #133

Open Manuelhrokr opened 1 year ago

Manuelhrokr commented 1 year ago

Hi,

When I pass a GradientBoostingClassifier() object that was fitted and optimized separately via the scikit-learn API as the tree_generator, i.e. RuleFitClassifier(tree_generator = GradientBoostingClassifier()), fitting the RuleFitClassifier raises the following error:

ValueError: n_estimators=1 must be larger or equal to estimators_.shape[0]=100 when warm_start==True
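The same message can be reproduced with scikit-learn alone, which suggests the error comes from refitting an already fitted booster with fewer estimators under warm_start. The snippet below is a minimal sketch on synthetic data. It assumes that the parameters observed on the generator after the RuleFitClassifier fit (n_estimators=1, warm_start=True, max_leaf_nodes=4) were applied to the already fitted object; that assumption is not confirmed against the imodels source.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Fit a booster separately, as in the report (default n_estimators=100).
gb = GradientBoostingClassifier(random_state=0).fit(X, y)

# Assumption: mimic the parameters seen on the generator after the
# RuleFitClassifier fit attempt, applied to the already fitted object.
gb.set_params(n_estimators=1, warm_start=True, max_leaf_nodes=4)

try:
    # warm_start=True keeps the 100 fitted stages, but n_estimators is now 1.
    gb.fit(X, y)
    error_message = None
except ValueError as exc:
    error_message = str(exc)

print(error_message)
```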

Inspecting what is inside RuleFitClassifier(tree_generator = GradientBoostingClassifier()) after fitting the model shows that the GradientBoostingClassifier() has been changed to parameters different from those I optimized before fitting RuleFitClassifier(), i.e., GradientBoostingClassifier(max_leaf_nodes=4, n_estimators=1, random_state=0, warm_start=True). I am not sure why these parameters of the GradientBoostingClassifier() are changed inside the RuleFitClassifier() object.

With RuleFitClassifier(tree_generator = None), everything works well.
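For comparison, an unfitted copy of the tuned booster does not hit the warm_start check. The sketch below uses only scikit-learn: sklearn.base.clone keeps the hyperparameters and drops the fitted state. The hyperparameter values are placeholders, and whether RuleFitClassifier accepts such a copy as tree_generator, or keeps its hyperparameters, is an untested assumption.

```python
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Stand-in for the separately optimized booster (placeholder hyperparameters).
tuned = GradientBoostingClassifier(n_estimators=50, max_depth=2, random_state=0)
tuned.fit(X, y)

# clone() copies the hyperparameters but none of the fitted attributes.
fresh = clone(tuned)
fresh_is_fitted = hasattr(fresh, "estimators_")

# Applying the parameters reported in this issue to the unfitted copy
# and fitting it does not raise, since there are no earlier stages to keep.
fresh.set_params(n_estimators=1, warm_start=True)
fresh.fit(X, y)
n_stages = fresh.estimators_.shape[0]

print(fresh_is_fitted, n_stages)
```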

As per documentation:

tree_generator : Optional: this object will be used as provided to generate the rules. This will override almost all the other properties above. Must be GradientBoostingRegressor(), GradientBoostingClassifier(), or RandomForestRegressor()

The closest solution I found is in Issue #34; however, the behavior is still not clear.

Any help will be highly appreciated.

Many thanks!