andosa / treeinterpreter

BSD 3-Clause "New" or "Revised" License

How is joint contribution calculated over a deep tree? How to set the max number of elements in a joint set, i.e. to doublets or triplets, over a deep tree? #22

Open lynnyi opened 5 years ago

lynnyi commented 5 years ago

I was able to calculate joint contributions for a random forest binary classifier trained with max depth = 30. My understanding is that, if the path of a particular instance through the tree is 1->2->3->4->5->...->30, the joint contributions should be calculated for (1,2), (2,3), (3,4), ..., (1,2,3), (2,3,4), (3,4,5), ..., (1,2,3,4), (2,3,4,5), etc., i.e. the number of joint contributions scales as fibonacci_n, where n is the depth of the tree (or the depth of the path).
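For illustration, the enumeration described above can be sketched in plain Python. This is only a sketch of that reading of "joint contribution" (every contiguous run of two or more nodes along one decision path), not a description of what treeinterpreter does internally. The helper name `contiguous_subpaths` and its `min_len`/`max_len` parameters are hypothetical. Under this reading, a path of n nodes has n(n-1)/2 such runs, which gives a concrete number to compare against what the library returns.

```python
def contiguous_subpaths(path, min_len=2, max_len=None):
    """Return every contiguous run of nodes in `path` whose length is
    between min_len and max_len (inclusive). Hypothetical helper that
    illustrates the enumeration described in the post."""
    n = len(path)
    if max_len is None:
        max_len = n
    runs = []
    for length in range(min_len, min(max_len, n) + 1):
        # Slide a window of the given length along the path.
        for start in range(n - length + 1):
            runs.append(tuple(path[start:start + length]))
    return runs


# A path through 30 nodes, labelled 1..30 as in the post.
path = list(range(1, 31))

all_runs = contiguous_subpaths(path)
pairs_and_triples = contiguous_subpaths(path, max_len=3)

print(len(all_runs))           # 435 = 30 * 29 / 2
print(len(pairs_and_triples))  # 57 = 29 pairs + 28 triples
print(pairs_and_triples[:3])   # [(1, 2), (2, 3), (3, 4)]
```

Restricting `max_len` to 2 or 3 corresponds to the "doublets or triplets" limit asked about in the title.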

For a single instance, I get thousands of joint contributions, but the contribution groups are not what I expect: the sets of nodes are so disjoint that they don't seem to follow a single path through the tree, so they can't pertain to only one instance.

So questions are: 1) How is joint contribution calculated for a deep tree? 2) Is there a way to specify a max number of nodes per joint contribution?
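Regarding question 2, one possible workaround is to filter the result after the fact. The sketch below assumes that `ti.predict(model, X, joint_contribution=True)` returns, for each instance, a dict keyed by tuples; that return shape is an assumption here and should be checked against the installed version. The helper `cap_joint_size` and the example values are hypothetical and made up purely for illustration.

```python
def cap_joint_size(joint_contribs, max_size):
    """Keep only the joint contributions whose key has at most
    `max_size` elements. `joint_contribs` is assumed to be a list
    with one dict per instance, keyed by tuples (hypothetical shape)."""
    return [
        {key: value for key, value in per_instance.items() if len(key) <= max_size}
        for per_instance in joint_contribs
    ]


# Made-up stand-in for one instance's joint contributions.
example = [{(0,): 0.10, (0, 3): -0.05, (0, 2, 3): 0.20, (0, 2, 3, 7): 0.01}]

capped = cap_joint_size(example, max_size=2)
print(capped)  # [{(0,): 0.1, (0, 3): -0.05}]
```

Note that dropping the larger groups also drops their values, so the remaining contributions no longer sum to the prediction minus the bias.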