imbs-hl / ranger

A Fast Implementation of Random Forests
http://imbs-hl.github.io/ranger/

Default value of num.random.splits #417

Closed: l47y closed this issue 4 years ago

l47y commented 5 years ago

Hey Marvin,

I was looking for the default value of num.random.splits when using extratrees, but could not find it. Searching the C++ files, I found const uint DEFAULT_NUM_RANDOM_SPLITS = 1; but I am not sure how everything fits together there. Does this argument default to 1 when it is not specified? And if yes: Does the 1 mean that there is one split done per tree, or one split per tree and per variable?

Thanks a lot in advance :-)

Edit: Changing num.random.splits to 2 makes it a little bit slower, so probably 1 is the default value.

mnwright commented 5 years ago

Yes, it's 1. See here: https://github.com/imbs-hl/ranger/blob/375d92f6c8cba37c93f3b10d55a65b6596dd5a63/R/ranger.R#L206 or in the ranger help in R.

> And if yes: Does the 1 mean that there is one split done per tree, or one split per tree and per variable?

In each node, mtry variables are randomly selected and num.random.splits random split points per variable are tried.
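To make the per-node behaviour concrete, here is a minimal Python sketch of the candidate generation described above. It is not ranger's C++ code: the function name `candidate_splits` and its arguments are hypothetical, and it assumes each random split point is drawn uniformly between the minimum and maximum of the variable within the node.

```python
import random


def candidate_splits(node_rows, mtry, num_random_splits=1, rng=None):
    """Return {variable index: [random split points]} for a single node.

    node_rows: list of equal-length lists of numeric feature values
               (the observations that reached this node).
    Hypothetical helper for illustration only, not part of ranger.
    """
    if rng is None:
        rng = random.Random()
    num_vars = len(node_rows[0])
    # Step 1: randomly select mtry variables for this node.
    chosen = rng.sample(range(num_vars), mtry)
    candidates = {}
    for var in chosen:
        values = [row[var] for row in node_rows]
        lo, hi = min(values), max(values)
        # Step 2: draw num_random_splits split points for this variable
        # (assumption: uniform between the node's min and max).
        candidates[var] = [rng.uniform(lo, hi) for _ in range(num_random_splits)]
    return candidates


rows = [[1.0, 10.0, 100.0], [2.0, 20.0, 200.0], [3.0, 30.0, 300.0]]
splits = candidate_splits(rows, mtry=2, num_random_splits=1)
# Total candidates evaluated in this node: mtry * num_random_splits
print(sum(len(points) for points in splits.values()))
```

Under these assumptions a node evaluates mtry * num.random.splits candidate splits in total, so going from 1 to 2 doubles the candidates per node, which would be consistent with the slight slowdown reported in the edit above.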

mnwright commented 4 years ago

We now include num.random.splits in the ranger output, see https://github.com/imbs-hl/ranger/commit/401f072bc81a48a91abbcf2b8eac5b50fe7b6af5.