interpretml / interpret

Fit interpretable models. Explain blackbox machine learning.
https://interpret.ml/docs
MIT License

Operations when merging EBM #532

Open JWKKWJ123 opened 3 weeks ago

JWKKWJ123 commented 3 weeks ago

Hi all, I am currently using ebm.merge(). However, I haven't found a formula or pseudocode describing how EBMs are merged. If I want to explain the merging of EBMs to others, is there a paper, formula, or pseudocode I can refer to?

paulbkoch commented 3 weeks ago

Hi @JWKKWJ123 -- I'm not aware of a paper that describes the merge function. The effect of merge_ebms() is the same as traditional model ensembling, which you could do with any model. In the case of EBMs though, since the predictions come from additive functions, instead of averaging the predictions after they've been made, you can, through associativity, move the averaging back into the partial response functions. There's more complexity in practice because the bins of the EBMs don't necessarily line up. We handle that by making new bins that are a superset of all the bins of the EBMs being merged.
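The idea can be sketched in a few lines of NumPy. This is a hypothetical helper, not interpret's actual merge_ebms() implementation: it rebins two piecewise-constant shape functions onto the union of their cut points and averages the per-bin scores.

```python
import numpy as np

def merge_shape_functions(cuts_a, scores_a, cuts_b, scores_b):
    """Sketch: merge two piecewise-constant shape functions by rebinning
    both onto the union of their cut points and averaging the scores.
    (Hypothetical helper, not interpret's API.) Bins for n cuts are
    (-inf, c0), [c0, c1), ..., [c_{n-1}, +inf), one score per bin."""
    # Union of cut points gives bins that are a superset of both binnings.
    merged_cuts = np.union1d(cuts_a, cuts_b)

    def rebin(cuts, scores, new_cuts):
        # For each new bin, find which old bin its left edge falls in
        # and reuse that bin's score.
        left_edges = np.concatenate(([-np.inf], new_cuts))
        idx = np.searchsorted(cuts, left_edges, side="right")
        return np.asarray(scores)[idx]

    merged_scores = (rebin(cuts_a, scores_a, merged_cuts)
                     + rebin(cuts_b, scores_b, merged_cuts)) / 2.0
    return merged_cuts, merged_scores
```

Rebinning never changes a model's predictions (each new bin inherits the score of the old bin containing it), so averaging the rebinned scores is exactly equivalent to averaging the two models' outputs.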

JWKKWJ123 commented 3 weeks ago

Dear Paul, Thank you very much! I am not familiar with decision trees. To my understanding, the 'partial response functions' = 'shape functions' = 'decision trees' in an EBM, so merging EBMs means merging each corresponding decision tree, and I guess the intercept is the arithmetic mean. But I have no idea how the new bins in the merged EBM are defined. So I drew pseudocode for merging EBMs and an example of merging two decision trees. I would like to ask if my understanding is correct (I suspect it is wrong)?

[two attached images: the user's pseudocode sketch of merging EBMs and an example of merging two decision trees]

paulbkoch commented 3 weeks ago

Hi @JWKKWJ123 -- The range of x should be from -inf to +inf. Using your example mostly, let's say I have:

EBM1, bin_range: score
[-inf, 1): 0
[1, 3): 1.5
[3, +inf]: 3

EBM2, bin_range: score
[-inf, 2): 0
[2, 4): 2
[4, +inf]: 2.5

The new bins and scores will be:
[-inf, 1): 0     ------> (0 + 0)/2
[1, 2): 0.75     ------> (1.5 + 0)/2
[2, 3): 1.75     ------> (1.5 + 2)/2
[3, 4): 2.5      ------> (3 + 2)/2
[4, +inf]: 2.75  ------> (3 + 2.5)/2

And if EBM1 and EBM2 have intercepts, then take the average of that too.
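The arithmetic in this example can be checked mechanically. A minimal NumPy sketch (the cut points and scores are transcribed from the example above; this is not interpret's internal code):

```python
import numpy as np

# Cut points and per-bin scores from the example.
# Bins are (-inf, c0), [c0, c1), ..., [c_last, +inf), one score per bin.
cuts1, scores1 = np.array([1.0, 3.0]), np.array([0.0, 1.5, 3.0])
cuts2, scores2 = np.array([2.0, 4.0]), np.array([0.0, 2.0, 2.5])

new_cuts = np.union1d(cuts1, cuts2)                 # superset bins: cuts at 1, 2, 3, 4
left_edges = np.concatenate(([-np.inf], new_cuts))  # left edge of each merged bin
s1 = scores1[np.searchsorted(cuts1, left_edges, side="right")]
s2 = scores2[np.searchsorted(cuts2, left_edges, side="right")]
merged = (s1 + s2) / 2   # merged scores: 0, 0.75, 1.75, 2.5, 2.75
```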

JWKKWJ123 commented 3 weeks ago

> Hi @JWKKWJ123 -- The range of x should be from -inf to +inf. Using your example mostly, let's say I have:
>
> EBM1, bin_range:score [-inf, 1): 0 [1, 3): 1.5 [3, +inf]: 3
>
> EBM2, bin_range:score [-inf, 2): 0 [2, 4): 2 [4, +inf]: 2.5
>
> The new bins and scores will be: [-inf, 1): 0 ------> (0+0)/2 [1, 2): 0.75 ------> (1.5+0)/2 [2, 3): 1.75 ------> (1.5+2)/2 [3, 4): 2.5 ------> (3+2)/2 [4, +inf]: 2.75 ------> (3+2.5)/2
>
> And if EBM1 and EBM2 have intercepts, then take the average of that too.

Hi Paul, Thank you very much! The merging of EBMs and trees is much clearer to me now. I have one more question: the merged model will have more bins, so if the number of bins exceeds max_bins, do the bins in the merged model need to be merged with each other? From my understanding, merging bins could mean combining the boundaries and averaging the values.

paulbkoch commented 3 weeks ago

max_bins only applies when fitting EBMs. If you merge EBMs afterwards, there is no upper limit. max_bins only applies to continuous features too, BTW.
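Since the merged bins are the union of all models' cut points, the merged model can end up with more bins than any single model's max_bins allowed at fit time. A toy illustration with made-up cut arrays:

```python
import numpy as np

# Hypothetical cut points from three separately fitted EBMs
# (n cuts define n + 1 bins: two models with 3 bins, one with 2).
cuts_per_model = [np.array([1.0, 3.0]),
                  np.array([2.0, 4.0]),
                  np.array([2.5])]

# The merged binning uses the union of all cut points.
merged_cuts = cuts_per_model[0]
for cuts in cuts_per_model[1:]:
    merged_cuts = np.union1d(merged_cuts, cuts)

n_bins = len(merged_cuts) + 1  # 6 bins, more than any input model had
```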

JWKKWJ123 commented 2 weeks ago

> max_bins only applies when fitting EBMs. If you merge EBMs afterwards, there is no upper limit. max_bins only applies to continuous features too, BTW.

Hi Paul, Thank you very much! Everything is clear to me now.