google / yggdrasil-decision-forests

A library to train, evaluate, interpret, and productionize decision forest models such as Random Forest and Gradient Boosted Decision Trees.
https://ydf.readthedocs.io/
Apache License 2.0
498 stars, 53 forks

Obtaining or predicting leaf index using ydf #117

goleng closed 4 months ago

goleng commented 4 months ago

Hello 👋

Thanks for the clear, easy-to-digest documentation and guides on how to leverage ydf for different ML tasks.

I'd like to know whether ydf supports or provides a mechanism for extracting the leaf index of each tree, as other libraries such as xgboost, catboost, and lightgbm do. This is useful when one wants to use a GBM to transform the data, i.e. for feature extraction.

I have read the documentation, but it's not clear to me whether this feature exists. One method I came across (although I haven't tried it yet) is get_all_trees in the advanced section of the documentation. I'm not sure if it provides this functionality.

Thanks in advance.

rstz commented 4 months ago

Hi,

thank you for the kind words :)

I think you might be looking for model.predict_leaves, which returns the index of the active leaf in every tree for each example. Does this fit your needs?
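
For the feature-extraction use case, a minimal sketch of what could be done with such leaf indices: one-hot encode them so that each (tree, leaf) pair becomes one binary feature. The helper one_hot_leaves and the toy matrix below are hypothetical and written in plain Python. With a trained model, the matrix would come from model.predict_leaves, assumed here to be laid out as one row per example and one column per tree.

```python
from typing import List, Sequence


def one_hot_leaves(leaves: Sequence[Sequence[int]]) -> List[List[int]]:
    """One-hot encode a (num_examples x num_trees) matrix of leaf indices."""
    if not leaves:
        return []
    num_trees = len(leaves[0])
    # Number of leaf slots per tree, inferred from the data itself
    # (an assumption; a real pipeline would fix these from the training set).
    sizes = [max(row[t] for row in leaves) + 1 for t in range(num_trees)]
    # Column offset at which each tree's block of indicator columns starts.
    offsets = [sum(sizes[:t]) for t in range(num_trees)]
    width = sum(sizes)

    features = []
    for row in leaves:
        encoded = [0] * width
        for t, leaf in enumerate(row):
            encoded[offsets[t] + leaf] = 1
        features.append(encoded)
    return features


# Hypothetical leaf indices for 3 examples and 2 trees. With a trained ydf
# model this matrix would come from model.predict_leaves(...) instead.
leaves = [[0, 2], [1, 0], [0, 1]]
features = one_hot_leaves(leaves)
```

Each encoded row contains exactly one 1 per tree, so the result can be fed to a downstream linear model. Inferring the number of leaves from the data is only a simplification for this sketch; to encode new examples consistently, the per-tree sizes should be computed once on the training data and reused.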

goleng commented 4 months ago

@rstz Thanks a lot for your timely response. It really helps. That's exactly what I'm looking for.