cerlymarco / shap-hypetune

A python package for simultaneous Hyperparameters Tuning and Features Selection for Gradient Boosting Models.
MIT License

Erratic behaviour #16

Closed mirix closed 2 years ago

mirix commented 2 years ago

Hi,

I am still running a series of experiments with shap-hypetune: a form of cross-validation over a number of stratified K-fold splits.

For each split, I generate a random seed like this: np.random.randint(4294967295).

A typical run goes like this (there is one for each split):

11 trials detected for ('num_leaves', 'n_estimators', 'max_depth', 'learning_rate')

trial: 0001 ### iterations: 00008 ### eval_score: 0.94737
trial: 0002 ### iterations: 00018 ### eval_score: 0.92481
trial: 0003 ### iterations: 00020 ### eval_score: 0.99248
trial: 0004 ### iterations: 00017 ### eval_score: 0.97744
trial: 0005 ### iterations: 00025 ### eval_score: 0.98496
trial: 0006 ### iterations: 00012 ### eval_score: 0.97744
trial: 0007 ### iterations: 00020 ### eval_score: 0.99248
trial: 0008 ### iterations: 00012 ### eval_score: 0.98496
trial: 0009 ### iterations: 00021 ### eval_score: 0.98496
trial: 0010 ### iterations: 00018 ### eval_score: 0.98496
trial: 0011 ### iterations: 00025 ### eval_score: 0.98496

11 trials detected for ('num_leaves', 'n_estimators', 'max_depth', 'learning_rate')

trial: 0001 ### iterations: 00025 ### eval_score: 0.96241
trial: 0002 ### iterations: 00038 ### eval_score: 0.97744
trial: 0003 ### iterations: 00037 ### eval_score: 0.97744
trial: 0004 ### iterations: 00015 ### eval_score: 0.96241
trial: 0005 ### iterations: 00002 ### eval_score: 0.81203
trial: 0006 ### iterations: 00018 ### eval_score: 0.96241
trial: 0007 ### iterations: 00016 ### eval_score: 0.96241
trial: 0008 ### iterations: 00011 ### eval_score: 0.91729
trial: 0009 ### iterations: 00038 ### eval_score: 0.97744
trial: 0010 ### iterations: 00022 ### eval_score: 0.96241
trial: 0011 ### iterations: 00021 ### eval_score: 0.96992

However, the eval_score sometimes drops dramatically.

This does not look like typical stochastic behaviour. For instance, if the score drops for one split, it normally drops for all subsequent splits as well, despite the fact that a new seed is (pseudo-)randomly generated for each split at each stage:

    import numpy as np
    from sklearn.model_selection import StratifiedKFold
    from lightgbm import LGBMClassifier
    from shaphypetune import BoostRFA

    skf = StratifiedKFold(n_splits=5, shuffle=True,
                          random_state=np.random.randint(4294967295))

    clf_lgbm = LGBMClassifier(boosting_type='rf',
                              random_state=np.random.randint(4294967295),
                              ...)

    model = BoostRFA(clf_lgbm,
                     sampling_seed=np.random.randint(4294967295),
                     ...)
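As an aside, seeds drawn this way can be made replayable by taking them all from a single seeded generator instead of the global RNG state. This is only a sketch of that idea, not part of the original script; the master seed value is arbitrary:

```python
import numpy as np

# Sketch (not from the original code): one master RNG, fixed once,
# hands out every per-split seed so an entire run can be replayed.
master = np.random.default_rng(12345)                 # arbitrary master seed
split_seeds = master.integers(0, 4294967295, size=5)  # one seed per fold

for seed in split_seeds:
    # each fold would then use its own reproducible seed, e.g.
    # StratifiedKFold(n_splits=5, shuffle=True, random_state=int(seed))
    print(int(seed))
```

Re-running with the same master seed reproduces the exact same sequence of per-split seeds, which makes runs like the ones above directly comparable.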

In other cases, the number of iterations is stuck at 1 for every trial:

11 trials detected for ('num_leaves', 'n_estimators', 'max_depth', 'learning_rate')

trial: 0001 ### iterations: 00001 ### eval_score: 0.69173
trial: 0002 ### iterations: 00001 ### eval_score: 0.7594
trial: 0003 ### iterations: 00001 ### eval_score: 0.69173
trial: 0004 ### iterations: 00001 ### eval_score: 0.69173
trial: 0005 ### iterations: 00001 ### eval_score: 0.79699
trial: 0006 ### iterations: 00001 ### eval_score: 0.69173
trial: 0007 ### iterations: 00001 ### eval_score: 0.69173
trial: 0008 ### iterations: 00001 ### eval_score: 0.7594
trial: 0009 ### iterations: 00001 ### eval_score: 0.69173
trial: 0010 ### iterations: 00001 ### eval_score: 0.69173
trial: 0011 ### iterations: 00001 ### eval_score: 0.69173

11 trials detected for ('num_leaves', 'n_estimators', 'max_depth', 'learning_rate')

trial: 0001 ### iterations: 00001 ### eval_score: 0.82707
trial: 0002 ### iterations: 00001 ### eval_score: 0.82707
trial: 0003 ### iterations: 00001 ### eval_score: 0.82707
trial: 0004 ### iterations: 00001 ### eval_score: 0.82707
trial: 0005 ### iterations: 00001 ### eval_score: 0.81955
trial: 0006 ### iterations: 00001 ### eval_score: 0.82707
trial: 0007 ### iterations: 00001 ### eval_score: 0.81955
trial: 0008 ### iterations: 00001 ### eval_score: 0.81955
trial: 0009 ### iterations: 00001 ### eval_score: 0.82707
trial: 0010 ### iterations: 00001 ### eval_score: 0.82707
trial: 0011 ### iterations: 00001 ### eval_score: 0.82707

If you re-run the script, you typically observe the normal behaviour again.

cerlymarco commented 2 years ago

Hi, thanks for your feedback, but I don't think this should be considered erratic behavior.

First of all, the splits are independent of one another, so there is no guarantee that all folds behave in the same manner (especially with real or unbalanced data).

RFA is more unstable than RFE given how it works (it starts from zero features and adds them recursively).

Finally, results are affected by the number of iterations and by any callbacks used.

If you have empirical evidence of a bug in the implementation, don't hesitate to let me know.
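For context, recursive feature addition can be sketched generically as a greedy forward-selection loop. This is not shap-hypetune's actual implementation; the least-squares R² scorer below is a hypothetical stand-in for the boosting model's eval metric:

```python
import numpy as np

def forward_selection(X, y, score_fn, n_keep):
    """Greedy recursive feature addition: start from zero features and
    repeatedly add the single feature that most improves score_fn."""
    selected = []
    remaining = list(range(X.shape[1]))
    while remaining and len(selected) < n_keep:
        best_feat, best_score = None, -np.inf
        for f in remaining:
            s = score_fn(X[:, selected + [f]], y)
            if s > best_score:
                best_feat, best_score = f, s
        selected.append(best_feat)
        remaining.remove(best_feat)
    return selected

def r2_of_lstsq(Xs, y):
    # R^2 of an ordinary least-squares fit on the chosen columns
    coef, *_ = np.linalg.lstsq(Xs, y, rcond=None)
    resid = y - Xs @ coef
    return 1 - resid.var() / y.var()

# Toy data where y depends only on columns 0 and 2
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 5))
y = X[:, 0] + 2 * X[:, 2]

print(forward_selection(X, y, r2_of_lstsq, n_keep=2))
```

Because each step locks in whichever feature looks best at that moment, early choices constrain all later ones, which is one reason a forward procedure like this can be less stable than recursive elimination.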