Closed ThomasBourgeois closed 3 years ago
Hi Thomas! Yeah, that sounds quite suboptimal. Just to troubleshoot the issue, have you changed the default hyperparameters of the QuantileRegressionForest? What happens if you try setting min_samples_leaf=100 or max_leaf_nodes=100, say?
I did not change the defaults. Actually, the predict method took something like 3 hours to give a result. And in the end the quantiles were exactly the same, meaning I could not make it work: the upper and lower bounds were identical.
Yeah I think what happened there was the forest producing trees with a single element in their leaves, which means that the quantiles will all be trivial (=identical). If you try changing one of the two arguments I mentioned above, then hopefully it should work! I guess the solution here could be just to change the defaults.
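To illustrate the explanation above, here is a minimal sketch using a single scikit-learn DecisionTreeRegressor rather than this library's QuantileRegressionForest, whose internals are not shown in this thread. The helper `leaf_quantiles` and the synthetic data are hypothetical, made up for the illustration. It computes quantiles from the training targets that share a leaf with each new sample, and compares a fully grown tree with one that uses min_samples_leaf=100.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor


def leaf_quantiles(tree, X_train, y_train, X_new, quantiles=(0.05, 0.95)):
    """Hypothetical helper: quantiles of the training targets in each sample's leaf."""
    train_leaves = tree.apply(X_train)  # leaf index of every training sample
    new_leaves = tree.apply(X_new)      # leaf index of every new sample
    out = np.empty((len(X_new), len(quantiles)))
    for i, leaf in enumerate(new_leaves):
        out[i] = np.quantile(y_train[train_leaves == leaf], quantiles)
    return out


# Synthetic data (an assumption for the demo): noisy sine on an evenly spaced grid.
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 2000).reshape(-1, 1)
y = np.sin(X[:, 0]) + rng.normal(scale=0.5, size=2000)
X_new = np.linspace(-2, 2, 5).reshape(-1, 1)

# Default settings grow the tree until every leaf is pure (one sample here).
deep = DecisionTreeRegressor(random_state=0).fit(X, y)
# Forcing at least 100 samples per leaf leaves a real distribution in each leaf.
shallow = DecisionTreeRegressor(min_samples_leaf=100, random_state=0).fit(X, y)

q_deep = leaf_quantiles(deep, X, y, X_new)
q_shallow = leaf_quantiles(shallow, X, y, X_new)

print("interval widths, default tree:       ", q_deep[:, 1] - q_deep[:, 0])
print("interval widths, min_samples_leaf=100:", q_shallow[:, 1] - q_shallow[:, 0])
```

With the default tree, the lower and upper quantiles coincide for every sample, because each leaf holds a single target value. With min_samples_leaf=100, each leaf holds enough samples for the interval to have a nonzero width.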
Oh OK, I thought the quantiles were computed from the distribution of the predictions of the different trees, not from the distribution of the samples in the leaves.
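For contrast, the alternative described in the comment above (quantiles taken across the per-tree predictions) could be sketched as follows. This uses scikit-learn's RandomForestRegressor and its `estimators_` attribute, not this library's API, and the synthetic data is an assumption for the demo. Note that this measures how much the trees disagree with each other, which is not the same thing as the conditional distribution of the target.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

# Synthetic data (an assumption for the demo): noisy sine on an evenly spaced grid.
rng = np.random.default_rng(0)
X = np.linspace(-3, 3, 2000).reshape(-1, 1)
y = np.sin(X[:, 0]) + rng.normal(scale=0.5, size=2000)
X_new = np.linspace(-2, 2, 5).reshape(-1, 1)

rf = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# One row per tree, one column per new sample.
per_tree = np.stack([tree.predict(X_new) for tree in rf.estimators_])

# Quantiles across the trees (axis 0), giving one lower/upper pair per sample.
lower, upper = np.quantile(per_tree, [0.05, 0.95], axis=0)

print("lower:", lower)
print("upper:", upper)
```

Because each tree is fit on a different bootstrap sample, the per-tree predictions differ even when the leaves contain a single element, so this interval does not collapse the way the leaf-based one does.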
I'm using the predict method on around 35,000 samples with 100 estimators: it has been running for 5 minutes already and is still going, on 4 CPUs. The train method is extremely slow too: on 300,000 samples with 100 estimators, it took at least 2 hours. Far from scikit-learn.
Thanks for the lib though! But right now it is hard for me to use due to this slowness.