interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.
https://interpret.ml/docs
MIT License

ZeroDivisionError in merge_ebms #485

Closed jfleh closed 5 months ago

jfleh commented 6 months ago

I am trying to merge two ebms (classifier or regressor, does not matter which one) and I get the following error:

Traceback (most recent call last):
  File "/code/trainingmanagerapi.py", line 725, in multiple_local_training
    fitted_model = merge_ebms([fitted_model, ebm2])
  File "/usr/local/lib/python3.9/site-packages/interpret/glassbox/_ebm/_merge_ebms.py", line 719, in merge_ebms
    ) = process_terms(n_classes, ebm.bagged_scores_, ebm.bin_weights_, ebm.bag_weights_)
  File "/usr/local/lib/python3.9/site-packages/interpret/glassbox/_ebm/_utils.py", line 235, in process_terms
    score_mean = np.average(scores, weights=weights)
  File "<__array_function__ internals>", line 180, in average
  File "/usr/local/lib/python3.9/site-packages/numpy/lib/function_base.py", line 547, in average
    raise ZeroDivisionError(
ZeroDivisionError: Weights sum to zero, can't be normalized

the two models have been fitted on the exact same dataset.
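For context, the error at the bottom of the traceback comes straight from NumPy: np.average raises ZeroDivisionError whenever the supplied weights sum to zero, because it cannot normalize them. A minimal standalone reproduction of just that NumPy behavior (not of the merge itself):

```python
import numpy as np

# np.average normalizes by the sum of the weights; if that sum is zero,
# NumPy raises ZeroDivisionError rather than dividing by zero.
scores = np.array([0.1, -0.2, 0.05])
weights = np.zeros(3)  # e.g. a term whose bins all received zero weight

try:
    np.average(scores, weights=weights)
except ZeroDivisionError as err:
    print(err)  # "Weights sum to zero, can't be normalized"
```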

paulbkoch commented 6 months ago

Hi @jfleh -- Are you using sample weights when generating either of the models? When fitting EBMs, we sum the sample weights for all the samples within each bin of each term, and we store that information in the ebm.bin_weights_ attribute of the model. The exception above is saying that the total of the sample weights for some term is zero (the total, not just a zero for one of the bins).

I could see a model built with extremely small sample weights doing this naturally, but the conditions would have to be almost impossibly special. The more likely scenario is that there's a bug somewhere in the merge_ebms function, probably having to do with merging pairs where a spurious term is somehow created during the merge.

I can look through merge_ebms and see if I can figure that out, but it would be easier to have some more information about the model first. If the model is private and cannot be posted here, can you look at the bin_weights_ attributes of your models and see whether any of the terms have all zeros in their bin weights? If your model is public, could you use the ebm.to_json(FILE_NAME) function to export a JSON representation of the models and post them here or email them to interpret@microsoft.com?

Documentation link: https://interpret.ml/docs/ExplainableBoostingClassifier.html#interpret.glassbox.ExplainableBoostingClassifier.to_json
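A quick sketch of the check described above, assuming `ebm` is a fitted ExplainableBoostingClassifier or Regressor (the helper name `zero_weight_terms` is hypothetical, not part of the interpret API):

```python
import numpy as np

def zero_weight_terms(ebm):
    """Return the indices of terms whose bin weights sum to zero.

    ebm.bin_weights_ is a list with one array of per-bin sample-weight
    totals for each term; a term whose total is zero is exactly the
    condition that makes np.average fail inside merge_ebms.
    """
    return [i for i, w in enumerate(ebm.bin_weights_) if np.sum(w) == 0]
```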

jfleh commented 6 months ago

Hi @paulbkoch, thanks for the response. I do indeed see lots of zeros in bin_weights_. I also noticed that I was trying to combine two models that were exactly identical (a result of being fitted on the same dataset with the same random_state). I am attaching the model. It was created with default parameters and fit on synthetically generated data; the predictors are independent of the targets, so there is nothing that can actually be learned from this dataset. I am curious whether it is something about the model itself, or the fact that the two models are identical, that causes this problem. model1.txt

jfleh commented 5 months ago

I am still getting the same error, now also with models trained on real data where there are genuine effects to pick up.

paulbkoch commented 5 months ago

I've pushed a fix for this issue which will be included in our next release. For details see: https://github.com/interpretml/interpret/commit/0c6c98552515845214849e0fb8a97c83a5172989

In the meantime, you can avoid this issue by not merging models that have features with only 1 value. Such features are entirely useless anyway, so removing them should not affect the model's performance. You can do this with:

import numpy as np

# Drop any term whose scores are identically zero before merging
ebm.remove_terms([i for i, scores in enumerate(ebm.term_scores_) if np.sum(np.abs(scores)) == 0])
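The list comprehension selects the indices of terms whose scores are identically zero, i.e. terms that contribute nothing to predictions. A minimal illustration of that selection logic with stand-in arrays (not a real EBM):

```python
import numpy as np

# Stand-in for ebm.term_scores_: one score array per term. A term whose
# scores are all zero adds nothing to any prediction and can be removed.
term_scores = [np.array([0.3, -0.3]), np.zeros(4), np.array([0.0, 0.1])]

dead_terms = [i for i, scores in enumerate(term_scores)
              if np.sum(np.abs(scores)) == 0]
print(dead_terms)  # [1]
```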

Thanks @jfleh for reporting this. It was a good bug to fix.