imbs-hl / ranger

A Fast Implementation of Random Forests
http://imbs-hl.github.io/ranger/

Support decision path #460

Open talegari opened 4 years ago

talegari commented 4 years ago

Hi Marvin,

It would be a great idea to provide the decision path of a test observation (the sequence of node IDs the observation passes through during prediction). Adding decision_path as an option for the type argument of ranger::predict.ranger seems like the intuitive place for it. This would help in understanding the data beyond their membership in terminal nodes. Let me know what you think.

Reference: scikit-learn implementation

mnwright commented 4 years ago

That's a good idea. What should the return value be? We could save a list of node IDs for every observation and tree, but a binary matrix for each tree (as in sklearn) could also work: because node IDs always increase with depth, the order of the path can be recovered from the set of visited nodes.
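To make the two candidate return values concrete, here is a minimal sketch in Python (not ranger code). The function names and the 7-node example tree are hypothetical; the only assumption taken from the comment above is that node IDs increase with depth along a path.

```python
def path_to_indicator(path, n_nodes):
    """Encode a path (list of node IDs) as a 0/1 row of length n_nodes."""
    row = [0] * n_nodes
    for node_id in path:
        row[node_id] = 1
    return row


def indicator_to_path(row):
    """Recover the path from a 0/1 row.

    This only works because node IDs are assumed to increase with depth,
    so reading the set positions in ascending order gives root-to-leaf order.
    """
    return [node_id for node_id, flag in enumerate(row) if flag]


# Hypothetical tree with 7 nodes: 0 -> (1, 2), 2 -> (3, 4), 4 -> (5, 6)
example_path = [0, 2, 4, 6]
example_row = path_to_indicator(example_path, n_nodes=7)
recovered_path = indicator_to_path(example_row)
```

Both forms carry the same information under that assumption; the list is shorter per observation, while the binary row has a fixed width per tree.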

talegari commented 4 years ago

IMHO, a sparse matrix (as in sklearn) is a good choice, as it might allow fast computation on the paths.
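As a rough illustration of why a sparse layout is convenient, the sketch below stores paths in a compressed-row style (an `indptr` array of row boundaries plus a flat `indices` array of visited node IDs) using plain Python lists. The helper names and the example paths are hypothetical, and this is not how ranger or sklearn implement it internally.

```python
def paths_to_csr(paths):
    """Pack a list of paths into (indptr, indices), compressed-row style."""
    indptr = [0]
    indices = []
    for path in paths:
        indices.extend(sorted(path))
        indptr.append(len(indices))  # end of this observation's row
    return indptr, indices


def row_of(indptr, indices, i):
    """Node IDs visited by observation i."""
    return indices[indptr[i]:indptr[i + 1]]


def shared_nodes(indptr, indices, i, j):
    """Number of nodes that observations i and j both pass through."""
    return len(set(row_of(indptr, indices, i)) & set(row_of(indptr, indices, j)))


# Three hypothetical observations in one tree
paths = [[0, 1], [0, 2, 3], [0, 2, 4, 6]]
indptr, indices = paths_to_csr(paths)
overlap = shared_nodes(indptr, indices, 1, 2)  # both pass through nodes 0 and 2
```

Only the visited nodes are stored, so memory grows with path length rather than with the total number of nodes in the tree.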

markusloecher commented 1 year ago

Was this ever implemented? I would be interested as well.

talegari commented 1 year ago

@markusloecher A decision path (in a decision tree) is a function of the terminal node alone. Hence, it can be inferred from ranger's predict method with type = 'terminalNodes'. Here is a code snippet.
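The idea can be illustrated independently of ranger. The following is a minimal Python sketch, not the snippet referred to above: it assumes a single tree stored as hypothetical `left`/`right` child arrays (with `None` marking "no child"), which is an illustrative layout and not ranger's actual data structure. Given a terminal node ID, such as one returned by a terminal-node prediction, the path is recovered by walking parent links back to the root.

```python
def build_parents(left, right):
    """Map each child node ID to its parent node ID."""
    parents = {}
    for node_id, (l, r) in enumerate(zip(left, right)):
        if l is not None:
            parents[l] = node_id
        if r is not None:
            parents[r] = node_id
    return parents


def decision_path(terminal_node, parents):
    """Walk from a terminal node up to the root, then reverse."""
    path = [terminal_node]
    while path[-1] in parents:  # the root has no parent entry
        path.append(parents[path[-1]])
    return path[::-1]


# Hypothetical tree: 0 -> (1, 2), 2 -> (3, 4), 4 -> (5, 6)
left = [1, None, 3, None, 5, None, None]
right = [2, None, 4, None, 6, None, None]
parents = build_parents(left, right)

# Terminal node IDs for three hypothetical observations
terminal_nodes = [6, 1, 3]
paths = [decision_path(t, parents) for t in terminal_nodes]
```

Because each node has exactly one parent, the path is unique for a given terminal node, which is why the terminal node alone determines it.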