imbs-hl / ranger

A Fast Implementation of Random Forests
http://imbs-hl.github.io/ranger/

Feature request: weighted quantile random forest regression #745

Open nikosGeography opened 1 month ago

nikosGeography commented 1 month ago

Hi,

I came across the paper "Weighted Quantile Regression Forests for Bimodal Distribution Modeling: A Loss Given Default Case". Basically, the authors performed a quantile random forest (qRF) regression, but they used performance-based weights for quantile estimation, so that trees with better performance weigh more. They added a few other things, but their methodology is described on pp. 7 to 9, with p. 9 showing the pseudocode, which I have attached below:

[image: pseudocode from the paper (Gostkowski & Gajowniczek, 2020)]

They used the quantregForest package for their analysis, but I believe ranger should do the job just fine. I contacted the authors of the paper and their reply was: "In general this package (quantregForest) returns matrix with probability for observations in each tree. Based on that someone can write his own function calculating final prediction. Instead of simple average he can add weights as well."
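
To make the idea of performance-based tree weights concrete, here is a minimal sketch in R. It is not the authors' procedure from the paper: it assumes, purely for illustration, that each tree is weighted by the inverse of its out-of-bag mean squared error. `oob_tree_weights()` is a hypothetical helper name. The ranger options `keep.inbag` and `predict.all` are existing arguments, and the ranger part only runs if the package is installed.

```r
# Hypothetical helper (not from the paper or any package): one weight per tree,
# proportional to the inverse of that tree's out-of-bag (OOB) mean squared error.
# pred:  n x num.trees matrix of per-tree predictions
# y:     numeric response of length n
# inbag: n x num.trees matrix of in-bag counts (0 = observation is OOB for that tree)
oob_tree_weights <- function(pred, y, inbag, eps = 1e-8) {
  stopifnot(nrow(pred) == length(y), all(dim(pred) == dim(inbag)))
  mse <- vapply(seq_len(ncol(pred)), function(t) {
    oob <- inbag[, t] == 0
    if (!any(oob)) return(NA_real_)          # tree saw every observation
    mean((y[oob] - pred[oob, t])^2)
  }, numeric(1))
  w <- 1 / (mse + eps)                       # eps avoids division by zero
  w[is.na(w)] <- 0                           # no OOB data: give the tree no weight
  w / sum(w)                                 # normalise so the weights sum to 1
}

if (requireNamespace("ranger", quietly = TRUE)) {
  train <- mtcars[1:26, ]
  rf <- ranger::ranger(mpg ~ ., train, quantreg = TRUE, keep.inbag = TRUE)
  per_tree <- predict(rf, train, predict.all = TRUE)$predictions  # 26 x num.trees
  inbag <- do.call(cbind, rf$inbag.counts)                        # 26 x num.trees
  tree_weights <- oob_tree_weights(per_tree, train$mpg, inbag)

  # Weighted instead of simple average of the per-tree predictions
  new_per_tree <- predict(rf, mtcars[27:32, ], predict.all = TRUE)$predictions
  weighted_mean_pred <- as.vector(new_per_tree %*% tree_weights)
}
```

Any other per-tree performance measure could be swapped in for the OOB error; the only requirement in this sketch is one non-negative weight per tree.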

Here is my question/request: Is weighted qRF something that you would be interested in implementing in your package, as an addition to the "classical" qRF?

If you don't have access to the paper, please let me know and I'll share it so you can have a look in more detail at their implementation and the steps they followed.

Thank you.

mnwright commented 1 month ago

I haven't checked what they are doing exactly, but I think it might already be possible, since the quantile prediction supports a user-supplied function that can be used for weighting:

library(ranger)
rf <- ranger(mpg ~ ., mtcars[1:26, ], quantreg = TRUE)
weights <- runif(rf$num.trees, 0, 1)
predict(rf, mtcars[27:32, ], type = "quantiles", what = function(x) {quantile(weights*x, probs = c(.1, .5, .9))})$predictions

Here I just used some random weights but maybe this works with the right weights?
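
One caveat, as far as I understand `predict.ranger`: the `what` function receives, for each observation, a vector with one value per tree, so `weights*x` rescales the values themselves rather than giving some trees more influence on the quantile. A minimal sketch of a weighted quantile, where each tree's value counts in proportion to its weight, could look like the following. `weighted_quantile()` is a hypothetical helper written for this example, not part of ranger, and the random weights are still only placeholders for performance-based ones.

```r
# Hypothetical helper: quantiles of x where element i counts with weight w[i].
# For each p, returns the smallest x whose cumulative normalised weight is >= p.
weighted_quantile <- function(x, w, probs) {
  stopifnot(length(x) == length(w))
  keep <- !is.na(x)                          # drop missing values and their weights
  x <- x[keep]
  w <- w[keep]
  stopifnot(all(w >= 0), sum(w) > 0)
  ord <- order(x)
  x <- x[ord]
  cum_w <- cumsum(w[ord]) / sum(w)
  # small tolerance guards against floating-point error when p is 1
  vapply(probs, function(p) x[which(cum_w >= p - 1e-12)[1]], numeric(1))
}

if (requireNamespace("ranger", quietly = TRUE)) {
  rf <- ranger::ranger(mpg ~ ., mtcars[1:26, ], quantreg = TRUE)
  tree_weights <- runif(rf$num.trees, 0, 1)  # placeholder weights, one per tree
  pred <- predict(
    rf, mtcars[27:32, ], type = "quantiles",
    what = function(x) weighted_quantile(x, tree_weights, probs = c(.1, .5, .9))
  )$predictions
}
```

With equal weights this reduces to an ordinary empirical quantile without interpolation, so results can differ slightly from the default `quantile()` output.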