Closed r3v1 closed 3 years ago
I ported the code from R here: https://github.com/imbs-hl/ranger/blob/e8b05f47892bb4968c4e6057f68b35bcd0b52225/R/ranger.R#L972 and I think it's just a mistake of casting to int in python here: https://github.com/crflynn/skranger/blob/aa2b5540b0b386321610ba10a449d11281a60e2e/skranger/ensemble/ranger_forest_regressor.py#L229
I noticed this too originally but for some reason didn't think twice about it. I think we can just remove the astype and it should be corrected.
I also tried removing astype as you suggested, but it raises IndexError: arrays used as indices must be of integer (or boolean) type
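For anyone hitting the same error, here is a minimal sketch of why dropping the integer cast fails. The arrays are made up for illustration and are not skranger's actual variables; the point is only that NumPy refuses float arrays as indices, even when they hold whole numbers.

```python
import numpy as np

# Hypothetical data, not taken from skranger.
values = np.array([10.5, 20.25, 30.75])
int_idx = np.array([0, 2, 1])        # integer dtype: valid as an index
float_idx = int_idx.astype(float)    # same numbers, float dtype


def try_float_index():
    """Return the message of the IndexError raised by float indexing."""
    try:
        values[float_idx]
    except IndexError as exc:
        return str(exc)
    return None


reordered = values[int_idx]          # works: picks elements 0, 2, 1
message = try_float_index()
print(message)
```

So the index array has to stay integer; the float dtype is needed on the array that stores the predicted values instead.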
Right, that's not it, actually. I need to take a closer look.
It's actually here where it creates the array: https://github.com/crflynn/skranger/blob/aa2b5540b0b386321610ba10a449d11281a60e2e/skranger/ensemble/ranger_forest_regressor.py#L311
It's being created as an integer array, so the subsequent steps are doing int coercion leading to the strange quantile results.
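A small sketch of that coercion, assuming the array is initialised by multiplying integer node ids by an integer zero (the shapes and numbers below are hypothetical, not the library's code). It would also explain why the reported quantiles land on whole numbers or on .5.

```python
import numpy as np

# Hypothetical terminal node ids for 2 samples x 2 trees (integer dtype).
terminal_nodes = np.array([[3, 7], [5, 3]])

# Multiplying by the integer 0 keeps the integer dtype.
node_values = 0 * terminal_nodes

# Float assignments into an integer array are truncated silently.
node_values[0, :] = np.array([19.65165, 20.9])
print(node_values[0])  # [19 20]

# A quantile over the truncated values interpolates between whole numbers,
# so it can only produce results such as 19.5.
truncated_median = np.quantile(node_values[0], 0.5)

# The same quantile over the untruncated floats keeps the decimals.
true_median = np.quantile(np.array([19.65165, 20.9]), 0.5)
print(truncated_median, true_median)
```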
Try changing this to
node_values = 0.0 * terminal_nodes
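As a rough check of why this helps: multiplying by the float 0.0 promotes the result to a float dtype, so later assignments keep their decimals. The sketch below uses a made-up terminal_nodes array; np.zeros_like with dtype=float is shown only as an equivalent, more explicit alternative and is not what the thread proposes.

```python
import numpy as np

# Hypothetical integer node ids, standing in for the real terminal_nodes.
terminal_nodes = np.array([[3, 7], [5, 3]])

int_values = 0 * terminal_nodes      # integer zeros (the buggy variant)
float_values = 0.0 * terminal_nodes  # float64 zeros (the suggested fix)

# Equivalent, more explicit way to get float zeros of the same shape.
explicit_values = np.zeros_like(terminal_nodes, dtype=float)

# The float array now stores the value without truncation.
float_values[0, 0] = 19.65165
print(int_values.dtype, float_values.dtype, float_values[0, 0])
```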
That's it! Thanks
Hi, I've been testing the Ranger Forest Regressor, and I noticed a strange behaviour when predicting quantiles: it outputs int-like values or values rounded to .5 (19., 2., 3.5) and not the expected floats (19.65165, etc.).
To reproduce this:
Then, I try to predict with quantiles:
Without quantiles:
In fact, the dataset used contains thousands of instances, so it's not a problem with the size of the dataset.
Thanks!