grf-labs / grf

Generalized Random Forests
https://grf-labs.github.io/grf/
GNU General Public License v3.0

feature weights? #1097

Open tibshirani opened 2 years ago

tibshirani commented 2 years ago

Dear grf team

Thanks for your great package! I have a research project for which feature_weights are essential. That is, I want to pass a non-negative vector whose length is the number of features, and the tree-growing algorithm would use those weights to determine the probability of choosing each feature for splitting. xgboost in Python has such an argument. Would it be possible to add this to grf?

thanks

Rob Tibshirani
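
For readers of this thread, the request amounts to weighted sampling of split candidates. Here is a minimal Python sketch of that idea. It is not grf's or xgboost's implementation; the function name `sample_split_candidates` and its arguments are hypothetical, chosen only for illustration. It draws up to `mtry` distinct features, each draw proportional to the remaining weights.

```python
import random


def sample_split_candidates(weights, mtry, rng):
    """Draw up to `mtry` distinct feature indices, weighted by `weights`.

    Hypothetical helper for illustration; not part of grf or xgboost.
    """
    if any(w < 0 for w in weights):
        raise ValueError("feature weights must be non-negative")
    # Features with zero weight can never be chosen.
    remaining = [j for j, w in enumerate(weights) if w > 0]
    chosen = []
    while remaining and len(chosen) < mtry:
        # One weighted draw among the features not yet chosen.
        pick = rng.choices(
            remaining, weights=[weights[j] for j in remaining], k=1
        )[0]
        chosen.append(pick)
        remaining.remove(pick)
    return chosen


rng = random.Random(0)
feature_weights = [0.0, 1.0, 1.0, 8.0]  # feature 0 is excluded entirely
candidates = sample_split_candidates(feature_weights, mtry=2, rng=rng)
```

With uniform weights this reduces to ordinary sampling without replacement, so it could in principle stand in for the usual candidate draw at each split.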

erikcs commented 2 years ago

Hi @tibshirani, I think @jtibshirani and @swager may have intentionally left that out to keep affinity with the GRF theory:

To more closely match the theory in the GRF paper, the number of variables considered is actually drawn from a Poisson distribution with mean equal to mtry. A new number is sampled from the distribution before every tree split.
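
As a rough illustration of the behaviour described in that passage, here is a Python sketch of drawing the number of candidate variables afresh for each split. The function name `draw_num_candidates` is hypothetical, and clamping the draw to the range from 1 to the number of features is an assumption made here, not something stated above.

```python
import math
import random


def draw_num_candidates(mtry, num_features, rng):
    """Draw a Poisson(mtry) count of variables to consider at one split.

    Hypothetical helper for illustration; not grf's actual code.
    """
    # Knuth's multiplication method for Poisson sampling (fine for small means).
    threshold = math.exp(-mtry)
    count, product = 0, rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    # Assumption: consider at least one and at most all features.
    return min(max(count, 1), num_features)


rng = random.Random(0)
# A new draw is made before every split, so the count varies within a tree.
draws = [draw_num_candidates(mtry=3, num_features=10, rng=rng) for _ in range(2000)]
```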

If you have integer weights, then a brute force first-pass at this would be to just duplicate those columns of the X matrix, if you want to try that first?
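
A minimal Python sketch of that column-duplication workaround, assuming X is held as a list of rows. The helper name `duplicate_columns` is hypothetical. It also returns a map from each expanded column back to its original feature, which is needed to interpret results afterwards.

```python
def duplicate_columns(X, int_weights):
    """Repeat column j of X int_weights[j] times; a weight of 0 drops it.

    Hypothetical helper for illustration. Returns the widened matrix and a
    list mapping each new column to its original column index.
    """
    if any(not isinstance(w, int) or w < 0 for w in int_weights):
        raise ValueError("weights must be non-negative integers")
    column_map = [j for j, w in enumerate(int_weights) for _ in range(w)]
    expanded = [[row[j] for j in column_map] for row in X]
    return expanded, column_map


X = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0]]
expanded, column_map = duplicate_columns(X, [1, 3, 0])
# expanded   == [[1.0, 2.0, 2.0, 2.0], [4.0, 5.0, 5.0, 5.0]]
# column_map == [0, 1, 1, 1]
```

Two caveats with this approach: the chance that a feature appears among the candidates grows with its number of copies but is not exactly proportional to the weight, and any per-column output (for example a variable importance) would have to be summed over `column_map` to recover per-feature values. Any default that depends on the number of columns would also see the widened matrix.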